this entry is going to be one chapter of geometry of physics
under construction

These notes aim to give an expository but rigorous introduction to the basic concepts of relativistic perturbative quantum field theories, specifically those that arise as the perturbative quantization of a Lagrangian field theory, such as quantum electrodynamics and quantum chromodynamics, which appear in the standard model of particle physics, and perturbative quantum gravity.
For a broad introduction to the idea of perturbative quantum field theory, see the entry of that name.
Here, first we consider classical field theory (or rather pre-quantum field theory), complete with BV-BRST formalism; then its deformation quantization via causal perturbation theory to perturbative quantum field theory. This mathematically rigorous (i.e. clear and precise) formulation of the traditional informal lore has come to be known as perturbative algebraic quantum field theory.
We aim to give a fully local discussion, where all structures arise on the “jet bundle over the field bundle” (introduced below) and “transgress” from there to the spaces of field histories over spacetime (discussed further below). This “Higher Prequantum Geometry” streamlines traditional constructions and serves the conceptualization in the theory. This is joint work with Igor Khavkine.
In full beauty these concepts are extremely general and powerful; but the aim here is to give a first precise idea of the subject, not a fully general account. Therefore we concentrate on the special case where spacetime is Minkowski spacetime (def. 23 below), where the field bundle (def. 34 below) is an ordinary trivial vector bundle (example 9 below) and hence the Lagrangian density (def. 60 below) is globally defined. Similarly, when considering gauge theory we consider just the special case that the gauge parameter-bundle is a trivial vector bundle and we concentrate on the case that the gauge symmetries are “closed irreducible” (def. 23 below). But we aim to organize all concepts such that the structure of their generalization to curved spacetime and non-trivial field bundles is immediate.
This comparatively simple setup already subsumes what is considered in traditional texts on the subject; it captures the established perturbative BRST-BV quantization of gauge fields coupled to fermions on curved spacetimes – which is the state of the art. Further generalization, necessary for the discussion of global topological effects, such as instanton configurations of gauge fields, will be discussed elsewhere (see at homotopical algebraic quantum field theory).
Alongside the theory we develop the concrete examples of the real scalar field, the electromagnetic field and the Dirac field:
running examples
| field | field bundle | Lagrangian density | equation of motion |
|---|---|---|---|
| real scalar field | expl. 10 | expl. 39 | expl. 45 |
| Dirac field | expl. 35 | expl. 43 | expl. 52 |
| electromagnetic field | expl. 11 | expl. 40 | expl. 46 |
| Yang-Mills field | expl. 12, expl. 13 | expl. 41 | expl. 47 |
| B-field | expl. 14 | expl. 42 | expl. 48 |

| field | Poisson bracket | causal propagator | Hadamard propagator | Feynman propagator |
|---|---|---|---|---|
| real scalar field | expl. 73, expl. 76 | prop. 65 | def. 107 | def. 108 |
| Dirac field | expl. 73, expl. 49 | prop. 76 | def. 109 | def. 110 |

| field | gauge symmetry | local BRST complex | gauge fixing |
|---|---|---|---|
| electromagnetic field | expl. 92 | expl. 79 | expl. 106 |
| Yang-Mills field | expl. 93 | … | … |
| B-field | … | … | … |
The electromagnetic field and the Dirac field combined are the fields of quantum electrodynamics which we turn to at the end below.
Acknowledgement
These notes profited greatly from discussions with Igor Khavkine.
Thanks also to Marco Benini, Klaus Fredenhagen, Arnold Neumaier, Kasia Rejzner for helpful discussion.
The geometry of physics is differential geometry. This is the flavor of geometry which is modeled on Cartesian spaces with smooth functions between them. Here we briefly review the basics of differential geometry on Cartesian spaces.
In principle the only background assumed of the reader here is
usual naive set theory (e.g. Lawvere-Rosebrugh 03);
the concept of the continuum: the real line $\mathbb{R}$, the plane $\mathbb{R}^2$, etc.;
the concepts of differentiation and integration of functions on such Cartesian spaces;
hence essentially the content of multi-variable differential calculus.
As we uncover Lagrangian field theory further below, we discover ever more general concepts of “space” in differential geometry, such as smooth manifolds, diffeological spaces, infinitesimal neighbourhoods, supermanifolds, Lie algebroids and super Lie ∞-algebroids. We introduce these incrementally as we go along:
more general spaces in differential geometry introduced further below

| kind of geometry | spaces, in order of increasing generality |
|---|---|
| differential geometry, higher differential geometry | smooth manifolds (def. 44); diffeological spaces (def. 35); smooth sets (def. 36); formal smooth sets (def. 40); super formal smooth sets (def. 48); super formal smooth ∞-groupoids (not needed in fully perturbative QFT) |
| infinitesimal geometry, Lie theory, higher Lie theory | infinitesimally thickened points (def. 39); superpoints (def. 46); Lie ∞-algebroids (def. 114) |
| needed in QFT for | spacetime (def. 23); space of field histories (def. 16); Cauchy surface (def. 87), perturbation theory (def. 84); Dirac field (expl. 35), Pauli exclusion principle; infinitesimal gauge symmetry/BRST complex (expl. 97) |
Abstract coordinate systems
What characterizes differential geometry is that it models geometry on the continuum, namely the real line $\mathbb{R}$, together with its Cartesian products $\mathbb{R}^n$, regarded with its canonical smooth structure (def. 1 below). We may think of these Cartesian spaces $\mathbb{R}^n$ as the “abstract coordinate systems” and of the smooth functions between them as the “abstract coordinate transformations”.
We will eventually consider below much more general “smooth spaces” $X$ than just the Cartesian spaces $\mathbb{R}^n$; but all of them are going to be understood by “laying out abstract coordinate systems” inside them, in the general sense of having smooth functions $f \colon \mathbb{R}^n \to X$ mapping a Cartesian space smoothly into them. All structure on generalized smooth spaces $X$ is thereby reduced to compatible systems of structures on just Cartesian spaces, one for each smooth “probe” $f \colon \mathbb{R}^n \to X$. This is called “functorial geometry”.
Notice that the popular concept of a smooth manifold (def./prop. 44 below) is essentially that of a smooth space $X$ which locally looks just like a Cartesian space, in that there exist sufficiently many $f \colon \mathbb{R}^n \to X$ which are (open) isomorphisms onto their images. Historically it was a long process to arrive at the insight that it is wrong to fix such local coordinate identifications $f$, or to have any structure depend on such a choice. But it is useful to go one step further:
In functorial geometry we do not even focus attention on those $f \colon \mathbb{R}^n \to X$ that are isomorphisms onto their image, but consider all “probes” of $X$ by “abstract coordinate systems”. This makes differential geometry both simpler and more powerful. The analogous insight for algebraic geometry is due to Grothendieck 65; it was transported to differential geometry by Lawvere 67.
This allows us to combine the best of two superficially disjoint worlds: On the one hand we may reduce all constructions and computations to coordinates, the way traditionally done in the physics literature; on the other hand we have full conceptual control over the coordinate-free generalized spaces analyzed thereby. What makes this work is that all coordinate-constructions are functorially considered over all abstract coordinate systems.
(Cartesian spaces and smooth functions between them)
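For $n \in \mathbb{N}$ we say that the set $\mathbb{R}^n$ of n-tuples of real numbers is a Cartesian space. This comes with the canonical coordinate functions
$$x^k \colon \mathbb{R}^n \to \mathbb{R}$$
which send an n-tuple of real numbers to the $k$th element in the tuple, for $k \in \{1, \cdots, n\}$.
For
$$f \colon \mathbb{R}^{n} \to \mathbb{R}^{n'}$$
any function between Cartesian spaces, we may ask whether its partial derivative along the $k$th coordinate exists, denoted
$$\frac{\partial f}{\partial x^k} \colon \mathbb{R}^{n} \to \mathbb{R}^{n'}.$$
If this exists, we may in turn ask that the partial derivative of the partial derivative exists
$$\frac{\partial^2 f}{\partial x^{k_2} \partial x^{k_1}} := \frac{\partial}{\partial x^{k_2}} \frac{\partial f}{\partial x^{k_1}}$$
and so on.
A general higher partial derivative obtained this way is, if it exists, indexed by an n-tuple of natural numbers $\alpha \in \mathbb{N}^n$ and denoted
$$\partial_\alpha := \frac{\partial^{|\alpha|}}{(\partial x^1)^{\alpha_1} (\partial x^2)^{\alpha_2} \cdots (\partial x^n)^{\alpha_n}}$$
where $|\alpha| := \sum_{i = 1}^n \alpha_i$ is the total order of the partial derivative.
If all partial derivatives to all orders of a function $f$ exist, then $f$ is called a smooth function.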
Of course the composition of two smooth functions is again a smooth function.
The inclined reader may notice that this means that Cartesian spaces with smooth functions between them constitute a category (“CartSp”); but the reader not so inclined may ignore this.
For the following it is useful to think of each Cartesian space as an abstract coordinate system. We will be dealing with various generalized smooth spaces (see the table below), but they will all be characterized by a prescription for how to smoothly map abstract coordinate systems into them.
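As a small worked illustration of higher partial derivatives indexed by a multi-index (the function below is a hypothetical choice made only for this example, not taken from the text), consider $\alpha = (2,1)$, of total order $|\alpha| = 3$, on $\mathbb{R}^2$:

```latex
\[
\begin{aligned}
  f(x^1, x^2) &= (x^1)^2 \, x^2
  \\
  \frac{\partial f}{\partial x^1} &= 2 x^1 x^2
  \\
  \frac{\partial^2 f}{\partial x^1 \partial x^1} &= 2 x^2
  \\
  \partial_{(2,1)} f
    = \frac{\partial^3 f}{\partial x^2 \, \partial x^1 \, \partial x^1} &= 2
\end{aligned}
\]
```

All partial derivatives of this polynomial exist (those of total order greater than 3 vanish), so it is a smooth function in the sense just defined.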
(coordinate functions are smooth functions)
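Given a Cartesian space $\mathbb{R}^n$, then all its coordinate functions (def. 1)
$$x^k \colon \mathbb{R}^n \to \mathbb{R}$$
are smooth functions (def. 1).
For
$$f \colon \mathbb{R}^{n_1} \to \mathbb{R}^{n_2}$$
any smooth function and $k \in \{1, \cdots, n_2\}$, write
$$f^k := x^k \circ f \colon \mathbb{R}^{n_1} \to \mathbb{R}$$
for its composition with this coordinate function.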
(algebra of smooth functions on Cartesian spaces)
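For each $n \in \mathbb{N}$, the set
$$C^\infty(\mathbb{R}^n) := \{ f \colon \mathbb{R}^n \to \mathbb{R} \;|\; f \text{ smooth} \}$$
of real number-valued smooth functions on the $n$-dimensional Cartesian space (def. 1) becomes a commutative associative algebra over the ring of real numbers by pointwise addition and multiplication in $\mathbb{R}$: for $f, g \in C^\infty(\mathbb{R}^n)$ and $x \in \mathbb{R}^n$
$$(f + g)(x) := f(x) + g(x), \qquad (f \cdot g)(x) := f(x) \cdot g(x).$$
The inclusion
$$\mathbb{R} \hookrightarrow C^\infty(\mathbb{R}^n)$$
is given by the constant functions.
We call this the real algebra of smooth functions on $\mathbb{R}^n$:
$$C^\infty(\mathbb{R}^n) \in \mathbb{R}Alg.$$
If
$$f \colon \mathbb{R}^{n_1} \to \mathbb{R}^{n_2}$$
is any smooth function (def. 1) then pre-composition with $f$ (“pullback of functions”)
$$f^\ast \colon C^\infty(\mathbb{R}^{n_2}) \to C^\infty(\mathbb{R}^{n_1}), \qquad g \mapsto f^\ast g := g \circ f$$
is an algebra homomorphism. Moreover, this is clearly compatible with composition in that
$$f_1^\ast ( f_2^\ast g ) = (f_2 \circ f_1)^\ast g.$$
Stated more abstractly, this means that assigning algebras of smooth functions is a functor
$$C^\infty(-) \colon CartSp \to \mathbb{R}Alg^{op}$$
from the category CartSp of Cartesian spaces and smooth functions between them (def. 1), to the opposite of the category $\mathbb{R}$Alg of $\mathbb{R}$-algebras.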
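To make pullback of functions concrete, here is a minimal worked sketch. The curve $f$ and the functions $g_1, g_2$ are hypothetical choices made for this illustration; pullback is taken to mean pre-composition, $f^\ast g = g \circ f$.

```latex
\[
\begin{aligned}
  f \colon \mathbb{R} &\to \mathbb{R}^2, \qquad t \mapsto (\cos t, \sin t)
  \\
  g_1(x^1, x^2) &= x^1, \qquad g_2(x^1, x^2) = x^2
  \\
  f^\ast g_1 &= g_1 \circ f = \cos t, \qquad f^\ast g_2 = g_2 \circ f = \sin t
  \\
  f^\ast ( g_1 \cdot g_2 ) &= \cos t \, \sin t = (f^\ast g_1) \cdot (f^\ast g_2)
  \\
  f^\ast \big( g_1^2 + g_2^2 \big) &= \cos^2 t + \sin^2 t = 1
\end{aligned}
\]
```

The fourth line is the homomorphism property in this instance; the last line shows that the function $(x^1)^2 + (x^2)^2$ pulls back to a constant along this curve.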
(local diffeomorphisms and open embeddings of Cartesian spaces)
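A smooth function $f \colon \mathbb{R}^n \to \mathbb{R}^n$ from one Cartesian space to itself (def. 1) is called a local diffeomorphism, denoted
$$f \colon \mathbb{R}^n \overset{et}{\to} \mathbb{R}^n,$$
if the determinant of the matrix of partial derivatives (the “Jacobian” of $f$) is everywhere non-vanishing:
$$\det \left( \frac{\partial f^i}{\partial x^j}(x) \right)_{i,j} \neq 0 \qquad \text{for all } x \in \mathbb{R}^n.$$
If the function $f$ is both a local diffeomorphism, as above, as well as an injective function then we call it an open embedding, denoted
$$f \colon \mathbb{R}^n \overset{et}{\hookrightarrow} \mathbb{R}^n.$$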
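The difference between a local diffeomorphism and an open embedding can be seen in a standard example, sketched here with a map chosen for illustration (it is not taken from the text): the Jacobian determinant never vanishes, yet the map is periodic in $x^2$ and hence not injective.

```latex
\[
\begin{aligned}
  f \colon \mathbb{R}^2 &\to \mathbb{R}^2, \qquad
  f(x^1, x^2) = \big( e^{x^1} \cos x^2 ,\; e^{x^1} \sin x^2 \big)
  \\
  \left( \frac{\partial f^i}{\partial x^j} \right)
  &=
  \begin{pmatrix}
    e^{x^1} \cos x^2 & - e^{x^1} \sin x^2 \\
    e^{x^1} \sin x^2 & \phantom{-} e^{x^1} \cos x^2
  \end{pmatrix}
  \\
  \det \left( \frac{\partial f^i}{\partial x^j} \right)
  &= e^{2 x^1} \big( \cos^2 x^2 + \sin^2 x^2 \big) = e^{2 x^1} \neq 0
  \\
  f(x^1, x^2 + 2\pi) &= f(x^1, x^2)
\end{aligned}
\]
```

So this $f$ is a local diffeomorphism but not an open embedding.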
(good open cover of Cartesian spaces)
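For $\mathbb{R}^n$ a Cartesian space (def. 1), a differentiably good open cover is
an indexed set
$$\left\{ f_i \colon \mathbb{R}^n \hookrightarrow \mathbb{R}^n \right\}_{i \in I}$$
of open embeddings (def. 2)
such that the images
$$U_i := im(f_i) \subset \mathbb{R}^n$$
satisfy:
(open cover) every point of $\mathbb{R}^n$ is contained in at least one of the $U_i$;
(good) all finite intersections $U_{i_1} \cap \cdots \cap U_{i_k}$ are either the empty set or themselves images of open embeddings according to def. 2.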
The inclined reader may notice that the concept of differentiably good open covers from def. 3 is a coverage on the category CartSp of Cartesian spaces with smooth functions between them, making it a site, but the reader not so inclined may ignore this.
(Fiorenza-Schreiber-Stasheff 12, def. 6.3.9)
Given any context of objects and morphisms between them, such as the Cartesian spaces and smooth functions from def. 1 it is of interest to fix one object and consider other objects parameterized over it. These are called bundles (def. 4) below. For reference, we briefly discuss here the basic concepts related to bundles in the context of Cartesian spaces.
Of course the theory of bundles is mostly trivial over Cartesian spaces; it gains its main interest from its generalization to more general smooth manifolds (def./prop. 44 below). It is still worthwhile for our development to first consider the relevant concepts in this simple case.
For more exposition see at fiber bundles in physics.
(bundles)
We say that a smooth function $E \overset{fb}{\to} \Sigma$ (def. 1) is a bundle just to amplify that we think of it as exhibiting $E$ as being a “space over $\Sigma$”:
$$\begin{array}{c} E \\ \downarrow^{fb} \\ \Sigma \end{array}$$
For $x \in \Sigma$ a point, we say that the fiber of this bundle over $x$ is the pre-image
$$E_x := fb^{-1}(\{x\}) \subset E$$
of the point $x$ under the smooth function. We think of $fb$ as exhibiting a “smoothly varying” set of fiber spaces over $\Sigma$.
Given two bundles $E_1 \overset{fb_1}{\to} \Sigma$ and $E_2 \overset{fb_2}{\to} \Sigma$ over $\Sigma$, a homomorphism of bundles between them is a smooth function $f \colon E_1 \to E_2$ (def. 1) between their total spaces which respects the bundle projections, in that
$$fb_2 \circ f = fb_1.$$
Hence a bundle homomorphism is a smooth function that sends fibers to fibers over the same point:
$$f\big( (E_1)_x \big) \subset (E_2)_x.$$
The inclined reader may notice that this defines a category of bundles over $\Sigma$, which is in fact just the slice category $CartSp_{/\Sigma}$; the reader not so inclined may ignore this.
(sections)
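Given a bundle $E \overset{fb}{\to} \Sigma$ (def. 4) a section is a smooth function $\sigma \colon \Sigma \to E$ such that
$$fb \circ \sigma = id_\Sigma.$$
This means that $\sigma$ sends every point $x \in \Sigma$ to an element in the fiber over that point
$$\sigma(x) \in E_x.$$
We write
$$\Gamma_\Sigma(E)$$
for the set of sections of a bundle.
For $E_1 \overset{fb_1}{\to} \Sigma$ and $E_2 \overset{fb_2}{\to} \Sigma$ two bundles and for
$$f \colon E_1 \to E_2$$
a bundle homomorphism between them (def. 4), then composition with $f$ sends sections to sections and hence yields a function denoted
$$f_\ast \colon \Gamma_\Sigma(E_1) \to \Gamma_\Sigma(E_2), \qquad \sigma \mapsto f \circ \sigma.$$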
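For $\Sigma$ and $F$ Cartesian spaces, then the Cartesian product $\Sigma \times F$ equipped with the projection
$$pr_1 \colon \Sigma \times F \to \Sigma$$
to $\Sigma$ is a bundle (def. 4), called the trivial bundle with fiber $F$. This represents the constant smoothly varying set of fibers, constant on $F$.
If $F = \ast$ is the point, then this is the identity bundle
$$id_\Sigma \colon \Sigma \to \Sigma.$$
Given any bundle $E \overset{fb}{\to} \Sigma$, then a bundle homomorphism (def. 4) from the identity bundle to $E \overset{fb}{\to} \Sigma$ is equivalently a section of $E \overset{fb}{\to} \Sigma$ (def. 5).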
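A bundle $E \overset{fb}{\to} \Sigma$ (def. 4) is called a fiber bundle with typical fiber $F$ if there exists a differentiably good open cover $\{U_i \subset \Sigma\}_{i \in I}$ (def. 3) such that the restriction of $fb$ to each $U_i$ is isomorphic to the trivial fiber bundle with fiber $F$ over $U_i$. Such diffeomorphisms are called local trivializations of the fiber bundle:
$$f_i \colon U_i \times F \overset{\simeq}{\longrightarrow} E|_{U_i}, \qquad fb \circ f_i = pr_1.$$
A vector bundle is a fiber bundle $E \overset{vb}{\to} \Sigma$ (def. 6) with typical fiber a vector space $V$ such that there exists a local trivialization $\{f_i\}_{i \in I}$ whose gluing functions
$$f_j^{-1} \circ f_i \colon (U_i \cap U_j) \times V \longrightarrow (U_i \cap U_j) \times V$$
for all $i, j \in I$ are linear functions over each point $x \in U_i \cap U_j$.
A homomorphism of vector bundles is a bundle morphism $f$ (def. 4) such that there exist local trivializations on both sides with respect to which $f$ is fiber-wise a linear map.
The inclined reader may notice that this makes vector bundles over $\Sigma$ a category (denoted $Vect_\Sigma$); the reader not so inclined may ignore this.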
(module of sections of a vector bundle)
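Given a vector bundle $E \overset{vb}{\to} \Sigma$ (def. 7), then its set of sections $\Gamma_\Sigma(E)$ (def. 5) becomes a real vector space by fiber-wise multiplication with real numbers. Moreover, it becomes a module over the algebra of smooth functions $C^\infty(\Sigma)$ (example 2) by the same fiber-wise multiplication:
$$C^\infty(\Sigma) \times \Gamma_\Sigma(E) \to \Gamma_\Sigma(E), \qquad (f, \sigma) \mapsto \big( x \mapsto f(x) \cdot \sigma(x) \big).$$
For $E_1 \overset{vb_1}{\to} \Sigma$ and $E_2 \overset{vb_2}{\to} \Sigma$ two vector bundles and
$$f \colon E_1 \to E_2$$
a vector bundle homomorphism (def. 7) then the induced function on sections (def. 5)
$$f_\ast \colon \Gamma_\Sigma(E_1) \to \Gamma_\Sigma(E_2)$$
is compatible with this action by smooth functions and hence constitutes a homomorphism of $C^\infty(\Sigma)$-modules.
The inclined reader may notice that this means that taking spaces of sections yields a functor
$$\Gamma_\Sigma(-) \colon Vect_\Sigma \longrightarrow C^\infty(\Sigma) Mod$$
from the category of vector bundles over $\Sigma$ to that of modules over $C^\infty(\Sigma)$.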
(tangent vector fields and tangent bundle)
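For $\mathbb{R}^n$ a Cartesian space (def. 1) the trivial vector bundle (example 3, def. 7)
$$T\mathbb{R}^n := \mathbb{R}^n \times \mathbb{R}^n \overset{pr_1}{\longrightarrow} \mathbb{R}^n$$
is called the tangent bundle of $\mathbb{R}^n$. With $(x^i)_{i = 1}^n$ the coordinate functions on $\mathbb{R}^n$ (def. 1) we write $(\partial_i)_{i = 1}^n$ for the corresponding linear basis of $\mathbb{R}^n$ regarded as a vector space. Then a general section (def. 5)
$$v \colon \mathbb{R}^n \to T\mathbb{R}^n$$
of the tangent bundle has a unique expansion of the form
$$v = v^i \partial_i$$
where a sum over indices is understood (Einstein summation convention) and where the components $(v^i \in C^\infty(\mathbb{R}^n))_{i = 1}^n$ are smooth functions on $\mathbb{R}^n$ (def. 1).
Such a $v$ is also called a smooth tangent vector field on $\mathbb{R}^n$.
Each tangent vector field $v$ on $\mathbb{R}^n$ determines a partial derivative on smooth functions
$$D_v \colon C^\infty(\mathbb{R}^n) \to C^\infty(\mathbb{R}^n), \qquad f \mapsto D_v f := v^i \frac{\partial f}{\partial x^i}.$$
By the product law of differentiation, this is a derivation on the algebra of smooth functions (example 2) in that
it is an $\mathbb{R}$-linear map in that
$$D_v( c_1 f_1 + c_2 f_2 ) = c_1 D_v f_1 + c_2 D_v f_2$$
it satisfies the Leibniz rule
$$D_v( f_1 \cdot f_2 ) = (D_v f_1) \cdot f_2 + f_1 \cdot (D_v f_2)$$
for all $c_1, c_2 \in \mathbb{R}$ and all $f_1, f_2 \in C^\infty(\mathbb{R}^n)$.
Hence regarding tangent vector fields as partial derivatives constitutes a linear function
$$D \colon \Gamma_{\mathbb{R}^n}(T\mathbb{R}^n) \longrightarrow Der( C^\infty(\mathbb{R}^n) )$$
from the space of sections of the tangent bundle. In fact this is a homomorphism of $C^\infty(\mathbb{R}^n)$-modules (example 4), in that for $f \in C^\infty(\mathbb{R}^n)$ and $v \in \Gamma_{\mathbb{R}^n}(T\mathbb{R}^n)$ we have
$$D_{f v} = f \cdot D_v.$$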
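As a quick check of the Leibniz rule, here is a worked instance on $\mathbb{R}^2$. The vector field and the functions are hypothetical choices for this illustration, and $D_v f$ is taken to mean $v^i \, \partial f / \partial x^i$.

```latex
\[
\begin{aligned}
  v &= x^2 \, \partial_1, \qquad f = x^1, \qquad g = x^1 x^2
  \\
  D_v f &= x^2 \cdot 1 = x^2, \qquad D_v g = x^2 \cdot x^2 = (x^2)^2
  \\
  D_v ( f \cdot g ) &= x^2 \, \frac{\partial}{\partial x^1} \big( (x^1)^2 x^2 \big)
    = 2 \, x^1 (x^2)^2
  \\
  (D_v f) \cdot g + f \cdot (D_v g) &= x^2 \cdot x^1 x^2 + x^1 \cdot (x^2)^2
    = 2 \, x^1 (x^2)^2
\end{aligned}
\]
```

Both sides agree, as the derivation property requires.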
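Let $E \overset{fb}{\to} \Sigma$ be a fiber bundle. Then its vertical tangent bundle $T_\Sigma E$ is the fiber bundle (def. 6) over $\Sigma$ whose fiber over a point $x \in \Sigma$ is the tangent bundle (example 5) of the fiber of $E$ over that point:
$$(T_\Sigma E)_x := T(E_x).$$
If $E \simeq \Sigma \times F$ is a trivial fiber bundle with fiber $F$, then its vertical tangent bundle is the trivial fiber bundle with fiber $T F$.
For $E \overset{vb}{\to} \Sigma$ a vector bundle (def. 7), its dual vector bundle $E^\ast \to \Sigma$ is the vector bundle whose fiber (def. 4) over $x \in \Sigma$ is the dual vector space of the corresponding fiber of $E$:
$$(E^\ast)_x := (E_x)^\ast.$$
The defining pairing of dual vector spaces $(E_x)^\ast \otimes E_x \to \mathbb{R}$ applied pointwise induces a pairing on the modules of sections (example 4) of the original vector bundle and its dual with values in the smooth functions (def. 1):
$$\Gamma_\Sigma(E) \otimes_{C^\infty(\Sigma)} \Gamma_\Sigma(E^\ast) \longrightarrow C^\infty(\Sigma).$$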
synthetic differential geometry
Below we encounter generalizations of ordinary differential geometry that include explicit “infinitesimals” in the guise of infinitesimally thickened points, as well as “super-graded infinitesimals”, in the guise of superpoints (necessary for the description of fermion fields such as the Dirac field). As we discuss below, these structures are naturally incorporated into differential geometry in just the same way as Grothendieck introduced them into algebraic geometry (in the guise of “formal schemes”), namely in terms of formally dual rings of functions with nilpotent ideals. That this also works well for differential geometry rests on the following three basic but important properties, which say that smooth functions behave “more algebraically” than their definition might superficially suggest:
(the three magic algebraic properties of differential geometry)
embedding of Cartesian spaces into formal duals of R-algebras
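For $\mathbb{R}^{n_1}$ and $\mathbb{R}^{n_2}$ two Cartesian spaces, the smooth functions $f \colon \mathbb{R}^{n_1} \to \mathbb{R}^{n_2}$ between them (def. 1) are in natural bijection with their induced algebra homomorphisms $f^\ast \colon C^\infty(\mathbb{R}^{n_2}) \to C^\infty(\mathbb{R}^{n_1})$ (example 2), so that one may equivalently handle Cartesian spaces entirely via their $\mathbb{R}$-algebras of smooth functions.
Stated more abstractly, this means equivalently that the functor $C^\infty(-)$ that sends a Cartesian space to its $\mathbb{R}$-algebra of smooth functions (example 2) is a fully faithful functor:
$$C^\infty(-) \colon CartSp \hookrightarrow \mathbb{R}Alg^{op}.$$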
(Kolar-Slovak-Michor 93, lemma 35.8, corollaries 35.9, 35.10)
embedding of smooth vector bundles into formal duals of R-algebra modules
For $E_1 \overset{vb_1}{\to} \Sigma$ and $E_2 \overset{vb_2}{\to} \Sigma$ two vector bundles (def. 7) there is then a natural bijection between vector bundle homomorphisms $f \colon E_1 \to E_2$ and the homomorphisms of modules $f_\ast \colon \Gamma_\Sigma(E_1) \to \Gamma_\Sigma(E_2)$ that these induce between the spaces of sections (example 4).
More abstractly this means that the functor $\Gamma_\Sigma(-)$ is a fully faithful functor
$$\Gamma_\Sigma(-) \colon Vect_\Sigma \hookrightarrow C^\infty(\Sigma) Mod$$
(Nestruev 03, theorem 11.29)
Moreover, the modules over the $\mathbb{R}$-algebra $C^\infty(\Sigma)$ of smooth functions on $\Sigma$ which arise this way as sections of smooth vector bundles over a Cartesian space $\Sigma$ are precisely the finitely generated free modules over $C^\infty(\Sigma)$.
(Nestruev 03, theorem 11.32)
vector fields are derivations of smooth functions.
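For $\mathbb{R}^n$ a Cartesian space (def. 1), then any derivation $D \colon C^\infty(\mathbb{R}^n) \to C^\infty(\mathbb{R}^n)$ on the $\mathbb{R}$-algebra of smooth functions (example 2) is given by differentiation with respect to a uniquely defined smooth tangent vector field: The function that identifies tangent vector fields with derivations from example 5
$$\Gamma_{\mathbb{R}^n}(T\mathbb{R}^n) \longrightarrow Der( C^\infty(\mathbb{R}^n) ), \qquad v \mapsto D_v$$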
is in fact an isomorphism.
(This follows directly from the Hadamard lemma.)
Actually all three statements in prop. 1 hold not just for Cartesian spaces, but generally for smooth manifolds (def./prop. 44 below), if only we generalize in the second statement from free modules to projective modules. However for our development here it is useful to first focus on just Cartesian spaces and then bootstrap the theory of smooth manifolds and much more from that, which we do below.
We introduce and discuss differential forms on Cartesian spaces.
(differential 1-forms on Cartesian spaces and the cotangent bundle)
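For $n \in \mathbb{N}$ a smooth differential 1-form $\omega$ on a Cartesian space $\mathbb{R}^n$ (def. 1) is an n-tuple
$$\left( \omega_i \in C^\infty(\mathbb{R}^n) \right)_{i = 1}^n$$
of smooth functions (def. 1), which we think of equivalently as the coefficients of a formal linear combination
$$\omega = \omega_i \, d x^i$$
on a set $\{ d x^1, d x^2, \cdots, d x^n \}$ of cardinality $n$.
Here a sum over repeated indices is tacitly understood (Einstein summation convention).
Write
$$\Omega^1(\mathbb{R}^n) \simeq C^\infty(\mathbb{R}^n)^{\times n}$$
for the set of smooth differential 1-forms on $\mathbb{R}^n$.
We may think of the expressions $(d x^i)_{i = 1}^n$ as a linear basis for the dual vector space $(\mathbb{R}^n)^\ast$. With this the differential 1-forms are equivalently the sections (def. 5) of the trivial vector bundle (example 3, def. 7)
$$T^\ast \mathbb{R}^n := \mathbb{R}^n \times (\mathbb{R}^n)^\ast \overset{pr_1}{\longrightarrow} \mathbb{R}^n$$
called the cotangent bundle of $\mathbb{R}^n$ (def. 9):
$$\Omega^1(\mathbb{R}^n) = \Gamma_{\mathbb{R}^n}( T^\ast \mathbb{R}^n ).$$
This amplifies via example 4 that $\Omega^1(\mathbb{R}^n)$ has the structure of a module over the algebra of smooth functions $C^\infty(\mathbb{R}^n)$, by the evident multiplication of differential 1-forms with smooth functions:
The set of differential 1-forms in a Cartesian space (def. 9) is naturally an abelian group with addition given by componentwise addition
$$\omega + \lambda = ( \omega_i + \lambda_i ) \, d x^i.$$
The abelian group $\Omega^1(\mathbb{R}^n)$ is naturally equipped with the structure of a module over the algebra of smooth functions $C^\infty(\mathbb{R}^n)$ (example 2), where the action is given by componentwise multiplication
$$f \cdot \omega = ( f \cdot \omega_i ) \, d x^i.$$
Accordingly there is a canonical pairing between differential 1-forms and tangent vector fields (example 5)
$$\iota_{(-)}(-) \colon \Gamma_{\mathbb{R}^n}(T\mathbb{R}^n) \otimes_{\mathbb{R}} \Gamma_{\mathbb{R}^n}(T^\ast\mathbb{R}^n) \longrightarrow C^\infty(\mathbb{R}^n), \qquad (v, \omega) \mapsto \iota_v \omega := v^i \omega_i.$$
With differential 1-forms in hand, we may collect all the first-order partial derivatives of a smooth function into a single object: the exterior derivative or de Rham differential is the $\mathbb{R}$-linear function
$$d \colon C^\infty(\mathbb{R}^n) \to \Omega^1(\mathbb{R}^n), \qquad f \mapsto d f := \frac{\partial f}{\partial x^i} \, d x^i.$$
Under the above pairing with tangent vector fields $v$ this yields the particular partial derivative along $v$:
$$\iota_v \, d f = D_v f = v^i \frac{\partial f}{\partial x^i}.$$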
We think of $d x^i$ as a measure for infinitesimal displacements along the $x^i$-coordinate of a Cartesian space. If we have a measure of infinitesimal displacement on some $\mathbb{R}^n$ and a smooth function $\phi \colon \mathbb{R}^{\tilde n} \to \mathbb{R}^n$, then this induces a measure for infinitesimal displacement on $\mathbb{R}^{\tilde n}$ by sending whatever happens there first with $\phi$ to $\mathbb{R}^n$ and then applying the given measure there. This is captured by the following definition:
(pullback of differential 1-forms)
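For $\phi \colon \mathbb{R}^{\tilde n} \to \mathbb{R}^n$ a smooth function, the pullback of differential 1-forms along $\phi$ is the function
$$\phi^\ast \colon \Omega^1(\mathbb{R}^n) \to \Omega^1(\mathbb{R}^{\tilde n})$$
between sets of differential 1-forms, def. 9, which is defined on basis-elements by
$$\phi^\ast d x^i := \frac{\partial \phi^i}{\partial \tilde x^j} \, d \tilde x^j$$
and then extended linearly by
$$\phi^\ast \omega = \phi^\ast \left( \omega_i \, d x^i \right) := ( \omega_i \circ \phi ) \, \frac{\partial \phi^i}{\partial \tilde x^j} \, d \tilde x^j.$$
This is compatible with identity morphisms and composition in that
$$( id_{\mathbb{R}^n} )^\ast = id_{\Omega^1(\mathbb{R}^n)}, \qquad ( \phi_2 \circ \phi_1 )^\ast = \phi_1^\ast \circ \phi_2^\ast.$$
Stated more abstractly, this just means that pullback of differential 1-forms makes the assignment of sets of differential 1-forms to Cartesian spaces a contravariant functor
$$\Omega^1(-) \colon CartSp^{op} \longrightarrow Set.$$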
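A standard worked instance of pulling back 1-forms is the change to polar coordinates. The map $\phi$ and the 1-form below are illustrative choices, not taken from the text; the rule used is that $\phi^\ast d x^i$ is obtained by differentiating the component $\phi^i$, with coefficients pre-composed with $\phi$.

```latex
\[
\begin{aligned}
  \phi \colon \mathbb{R}^2 &\to \mathbb{R}^2, \qquad
  \phi(r, \theta) = ( r \cos\theta ,\; r \sin\theta )
  \\
  \phi^\ast d x^1 &= \cos\theta \, d r - r \sin\theta \, d\theta
  \\
  \phi^\ast d x^2 &= \sin\theta \, d r + r \cos\theta \, d\theta
  \\
  \phi^\ast \big( x^1 \, d x^2 - x^2 \, d x^1 \big)
  &= r \cos\theta \, ( \sin\theta \, d r + r \cos\theta \, d\theta )
   - r \sin\theta \, ( \cos\theta \, d r - r \sin\theta \, d\theta )
  \\
  &= r^2 \, d\theta
\end{aligned}
\]
```

The $d r$ terms cancel, and the remaining terms combine via $\cos^2\theta + \sin^2\theta = 1$.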
The following definition captures the idea that if $d x^i$ is a measure for displacement along the $x^i$-coordinate, and $d x^j$ a measure for displacement along the $x^j$ coordinate, then there should be a way to get a measure, to be called $d x^i \wedge d x^j$, for infinitesimal surfaces (squares) in the $x^i$-$x^j$-plane. And this should keep track of the orientation of these squares, with
$$d x^j \wedge d x^i = - d x^i \wedge d x^j$$
being the same infinitesimal measure with orientation reversed.
(exterior algebra of differential n-forms)
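For $n \in \mathbb{N}$, the algebra of smooth differential forms on a Cartesian space $\mathbb{R}^n$ (def. 1) is the exterior algebra
$$\Omega^\bullet(\mathbb{R}^n) := \wedge^\bullet_{C^\infty(\mathbb{R}^n)} \Omega^1(\mathbb{R}^n)$$
over the algebra of smooth functions $C^\infty(\mathbb{R}^n)$ (example 2) of the module $\Omega^1(\mathbb{R}^n)$ of smooth 1-forms.
We write $\Omega^k(\mathbb{R}^n)$ for the sub-module of degree $k$ and call its elements the differential $k$-forms.
Explicitly this means that a differential $k$-form $\omega \in \Omega^k(\mathbb{R}^n)$ on $\mathbb{R}^n$ is a formal linear combination over $C^\infty(\mathbb{R}^n)$ (example 2) of basis elements of the form $d x^{i_1} \wedge \cdots \wedge d x^{i_k}$ for $i_1 \lt i_2 \lt \cdots \lt i_k$:
$$\omega = \omega_{i_1 \cdots i_k} \, d x^{i_1} \wedge \cdots \wedge d x^{i_k}.$$
Now all the constructions for differential 1-forms above extend naturally to differential $k$-forms: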
(exterior derivative or de Rham differential)
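For $\mathbb{R}^n$ a Cartesian space (def. 1) the de Rham differential $d \colon C^\infty(\mathbb{R}^n) \to \Omega^1(\mathbb{R}^n)$ (def. 9) uniquely extends as a derivation of degree +1 to the exterior algebra of differential forms (def. 11)
$$d \colon \Omega^\bullet(\mathbb{R}^n) \to \Omega^\bullet(\mathbb{R}^n)$$
meaning that for $\omega_i \in \Omega^{k_i}(\mathbb{R}^n)$ then
$$d( \omega_1 \wedge \omega_2 ) = ( d \omega_1 ) \wedge \omega_2 + (-1)^{k_1} \, \omega_1 \wedge d \omega_2.$$
In components this simply means that
$$d \omega = d \left( \omega_{i_1 \cdots i_k} \, d x^{i_1} \wedge \cdots \wedge d x^{i_k} \right) = \frac{\partial \omega_{i_1 \cdots i_k}}{\partial x^{i_0}} \, d x^{i_0} \wedge d x^{i_1} \wedge \cdots \wedge d x^{i_k}.$$
Since partial derivatives commute with each other, while differential 1-forms anti-commute, this implies that $d$ is nilpotent
$$d^2 = d \circ d = 0.$$
We say hence that differential forms form a cochain complex, the de Rham complex $( \Omega^\bullet(\mathbb{R}^n), d )$.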
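The cancellation behind nilpotency can be seen in a small worked case on $\mathbb{R}^2$ (the function is a hypothetical choice for this illustration): the two mixed second derivatives agree, while $d x^2 \wedge d x^1 = - d x^1 \wedge d x^2$.

```latex
\[
\begin{aligned}
  f &= (x^1)^2 \, x^2
  \\
  d f &= 2 x^1 x^2 \, d x^1 + (x^1)^2 \, d x^2
  \\
  d ( d f )
  &= \frac{\partial ( 2 x^1 x^2 )}{\partial x^2} \, d x^2 \wedge d x^1
   + \frac{\partial ( (x^1)^2 )}{\partial x^1} \, d x^1 \wedge d x^2
  \\
  &= 2 x^1 \, d x^2 \wedge d x^1 + 2 x^1 \, d x^1 \wedge d x^2
  \\
  &= 0
\end{aligned}
\]
```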
(contraction of differential n-forms with tangent vector fields)
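The pairing $\iota_v \omega = v^i \omega_i$ of tangent vector fields $v$ with differential 1-forms $\omega$ (def. 9) uniquely extends to the exterior algebra $\Omega^\bullet(\mathbb{R}^n)$ of differential forms (def. 11) as a derivation of degree -1
$$\iota_v \colon \Omega^{\bullet + 1}(\mathbb{R}^n) \to \Omega^\bullet(\mathbb{R}^n), \qquad \iota_v( \omega_1 \wedge \omega_2 ) = ( \iota_v \omega_1 ) \wedge \omega_2 + (-1)^{k_1} \, \omega_1 \wedge \iota_v \omega_2 \quad \text{for } \omega_1 \in \Omega^{k_1}(\mathbb{R}^n).$$
In particular for $\omega_1, \omega_2 \in \Omega^1(\mathbb{R}^n)$ two differential 1-forms, then
$$\iota_v ( \omega_1 \wedge \omega_2 ) = ( \iota_v \omega_1 ) \, \omega_2 - ( \iota_v \omega_2 ) \, \omega_1 \;\in \Omega^1(\mathbb{R}^n).$$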
(pullback of differential n-forms)
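For $f \colon \mathbb{R}^{n_1} \to \mathbb{R}^{n_2}$ a smooth function between Cartesian spaces (def. 1) the operation of pullback of differential 1-forms of def. 9 extends as an algebra homomorphism to the exterior algebra of differential forms (def. 11),
$$f^\ast \colon \Omega^\bullet(\mathbb{R}^{n_2}) \to \Omega^\bullet(\mathbb{R}^{n_1})$$
given on basis elements by
$$f^\ast \left( d x^{i_1} \wedge \cdots \wedge d x^{i_k} \right) = \left( f^\ast d x^{i_1} \right) \wedge \cdots \wedge \left( f^\ast d x^{i_k} \right).$$
This commutes with the de Rham differential $d$ on both sides (def. 12) in that
$$d \circ f^\ast = f^\ast \circ d$$
hence that pullback of differential forms is a chain map of de Rham complexes.
This is still compatible with identity morphisms and composition in that
$$( id_{\mathbb{R}^n} )^\ast = id_{\Omega^\bullet(\mathbb{R}^n)}, \qquad ( g \circ f )^\ast = f^\ast \circ g^\ast.$$
Stated more abstractly, this just means that pullback of differential forms makes the assignment of sets of differential forms to Cartesian spaces a contravariant functor
$$\Omega^\bullet(-) \colon CartSp^{op} \longrightarrow Set.$$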
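Let $\mathbb{R}^n$ be a Cartesian space (def. 1), and let $v \in \Gamma_{\mathbb{R}^n}(T\mathbb{R}^n)$ be a smooth tangent vector field (example 5).
For $t \in \mathbb{R}$ write $\exp(t v) \colon \mathbb{R}^n \overset{\simeq}{\to} \mathbb{R}^n$ for the flow by diffeomorphisms along $v$ of parameter length $t$.
Then the derivative with respect to $t$ of the pullback of differential forms along $\exp(t v)$, hence the Lie derivative $\mathcal{L}_v \colon \Omega^\bullet(\mathbb{R}^n) \to \Omega^\bullet(\mathbb{R}^n)$, is given by the anticommutator of the contraction derivation $\iota_v$ (def. 13) with the de Rham differential $d$ (def. 12):
$$\mathcal{L}_v \omega := \frac{d}{d t} \exp(t v)^\ast \omega \Big|_{t = 0} = \iota_v \, d \omega + d \, \iota_v \omega.$$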
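The formula $\mathcal{L}_v = \iota_v d + d \iota_v$ can be checked by hand in a simple case. The rotation vector field and the 1-form below are hypothetical choices for this sketch, written with coordinates $(x, y)$ on $\mathbb{R}^2$; the flow of $v$ is rotation by the angle $t$.

```latex
\[
\begin{aligned}
  v &= - y \, \partial_x + x \, \partial_y, \qquad \omega = x \, d y
  \\[4pt]
  \text{via contraction and } d: \quad
  d \omega &= d x \wedge d y, \qquad
  \iota_v d \omega = ( \iota_v d x ) \, d y - ( \iota_v d y ) \, d x
    = - y \, d y - x \, d x
  \\
  \iota_v \omega &= x^2, \qquad d \iota_v \omega = 2 x \, d x
  \\
  \iota_v d \omega + d \iota_v \omega &= x \, d x - y \, d y
  \\[4pt]
  \text{via the flow:} \quad
  \exp(t v)(x, y) &= ( x \cos t - y \sin t ,\; x \sin t + y \cos t )
  \\
  \exp(t v)^\ast \omega &= ( x \cos t - y \sin t ) \, ( \sin t \, d x + \cos t \, d y )
  \\
  \frac{d}{d t} \exp(t v)^\ast \omega \Big|_{t = 0}
  &= ( - y ) \, d y + x \, d x = x \, d x - y \, d y
\end{aligned}
\]
```

Both computations give the same 1-form.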
Finally we turn to the concept of integration of differential forms (def. 15 below). First we need to say what it is that differential forms may be integrated over:
(smooth singular simplicial chains in Cartesian spaces)
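For $n \in \mathbb{N}$, the standard n-simplex in the Cartesian space $\mathbb{R}^n$ (def. 1) is the subset
$$\Delta^n := \left\{ (x^1, \cdots, x^n) \in \mathbb{R}^n \;|\; 0 \leq x^1 \leq \cdots \leq x^n \leq 1 \right\} \subset \mathbb{R}^n.$$
More generally, a smooth singular n-simplex in a Cartesian space $\mathbb{R}^k$ is a smooth function (def. 1)
$$\sigma \colon \mathbb{R}^n \to \mathbb{R}^k$$
to be thought of as a smooth extension of its restriction
$$\sigma|_{\Delta^n} \colon \Delta^n \to \mathbb{R}^k.$$
(This is called a singular simplex because there is no condition that $\sigma$ be an embedding in any way, in particular $\sigma$ may be a constant function.)
A singular chain in $\mathbb{R}^k$ of dimension $n$ is a formal linear combination of singular $n$-simplices in $\mathbb{R}^k$.
In particular, given a singular $n$-simplex $\sigma$, then its boundary $\partial \sigma$ is a singular chain of singular $(n-1)$-simplices.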
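(fiber-integration of differential forms over smooth singular chains in Cartesian spaces)
For $n \in \mathbb{N}$ and $\omega \in \Omega^n(\mathbb{R}^n)$ a differential n-form (def. 11), which may be written as
$$\omega = f \, d x^1 \wedge \cdots \wedge d x^n,$$
then its integration over the standard n-simplex $\Delta^n \subset \mathbb{R}^n$ (def. 14) is the ordinary integral (e.g. Riemann integral)
$$\int_{\Delta^n} \omega := \int_{0 \leq x^1 \leq \cdots \leq x^n \leq 1} f(x^1, \cdots, x^n) \, d x^1 \cdots d x^n.$$
More generally, for $\omega \in \Omega^n(\mathbb{R}^k)$ a differential n-form and
$$C = \sum_i c_i \, \sigma_i$$
a singular $n$-chain (def. 14)
in any Cartesian space $\mathbb{R}^k$, the integration of $\omega$ over $C$ is the sum of the integrations, as above, of the pullback of differential forms (def. 2) along all the singular n-simplices in the chain:
$$\int_C \omega := \sum_i c_i \int_{\Delta^n} ( \sigma_i )^\ast \omega.$$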
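Finally, for $U$ another Cartesian space, then fiber integration of differential forms along $C$ is the linear map
$$\int_C \colon \Omega^{\bullet + n}( U \times \mathbb{R}^k ) \longrightarrow \Omega^\bullet(U)$$
which on differential forms of the form $\omega_U \wedge \omega$, with $\omega_U$ a form on $U$ and $\omega \in \Omega^n(\mathbb{R}^k)$, is given by
$$\int_C \omega_U \wedge \omega := (-1)^{n \, |\omega_U|} \left( \int_C \omega \right) \omega_U.$$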
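As a minimal worked example of integration over the standard simplex, assume the convention that $\Delta^2 = \{ 0 \leq x^1 \leq x^2 \leq 1 \} \subset \mathbb{R}^2$ (the integrands are illustrative choices):

```latex
\[
\begin{aligned}
  \int_{\Delta^2} d x^1 \wedge d x^2
  &= \int_0^1 \left( \int_0^{x^2} d x^1 \right) d x^2
   = \int_0^1 x^2 \, d x^2
   = \tfrac{1}{2}
  \\
  \int_{\Delta^2} x^1 \, d x^1 \wedge d x^2
  &= \int_0^1 \left( \int_0^{x^2} x^1 \, d x^1 \right) d x^2
   = \int_0^1 \tfrac{1}{2} (x^2)^2 \, d x^2
   = \tfrac{1}{6}
\end{aligned}
\]
```

The first value is the area of the triangle with vertices $(0,0)$, $(0,1)$ and $(1,1)$.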
(Stokes theorem for fiber-integration of differential forms)
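For $C$ a smooth singular simplicial chain in $\mathbb{R}^k$ (def. 14) the operation of fiber-integration of differential forms along $C$ (def. 15) is compatible with the exterior derivative $d_U$ on $U$ (def. 12) in that
$$d_U \int_C \omega = (-1)^{\dim(C)} \int_C d_U \omega = (-1)^{\dim(C)} \left( \int_C d \omega - \int_{\partial C} \omega \right)$$
where $d = d_U + d_{\mathbb{R}^k}$ is the de Rham differential on $U \times \mathbb{R}^k$ (def. 12) and where the second equality is the Stokes theorem along $C$:
$$\int_C d_{\mathbb{R}^k} \omega = \int_{\partial C} \omega.$$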
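The plain Stokes theorem $\int_C d \omega = \int_{\partial C} \omega$ can be verified by hand on a triangle. This sketch assumes the convention $\Delta^2 = \{ 0 \leq x^1 \leq x^2 \leq 1 \}$ and takes the boundary to be traversed counterclockwise, $(0,0) \to (1,1) \to (0,1) \to (0,0)$, which is the orientation compatible with $d x^1 \wedge d x^2$; the 1-form is an illustrative choice.

```latex
\[
\begin{aligned}
  \omega &= x^1 \, d x^2, \qquad d \omega = d x^1 \wedge d x^2
  \\
  \int_{\Delta^2} d \omega
  &= \int_0^1 \left( \int_0^{x^2} d x^1 \right) d x^2 = \tfrac{1}{2}
  \\
  \int_{(0,0) \to (1,1)} \omega
  &= \int_0^1 t \, d t = \tfrac{1}{2}
  \qquad ( x^1 = x^2 = t )
  \\
  \int_{(1,1) \to (0,1)} \omega &= 0
  \qquad ( x^2 \text{ constant, so } d x^2 = 0 )
  \\
  \int_{(0,1) \to (0,0)} \omega &= 0
  \qquad ( x^1 = 0 )
  \\
  \int_{\partial \Delta^2} \omega &= \tfrac{1}{2} + 0 + 0 = \int_{\Delta^2} d \omega
\end{aligned}
\]
```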
This concludes our review of the basics of (synthetic) differential geometry on which the following development of quantum field theory is based. In the next chapter we consider spacetime and spin.
Relativistic field theory takes place on spacetime.
The concept of spacetime makes sense for every dimension $p + 1$ with $p \in \mathbb{N}$. The observable universe has macroscopic dimension $3 + 1$, but quantum field theory generally makes sense also in lower and in higher dimensions. For instance quantum field theory in dimension 0+1 is the “worldline” theory of particles, also known as quantum mechanics; while quantum field theory in dimension $\gt p + 1$ may be “KK-compactified” to an “effective” field theory in dimension $p + 1$ which generally looks more complicated than its higher dimensional incarnation.
However, every realistic field theory, and also most of the non-realistic field theories of interest, contain spinor fields such as the Dirac field (example 43 below) and the precise nature and behaviour of spinors does depend sensitively on spacetime dimension. In fact the theory of relativistic spinors is mathematically most natural in just the following four spacetime dimensions:
$$p + 1 \in \{ 2 + 1, \; 3 + 1, \; 5 + 1, \; 9 + 1 \}.$$
In the literature one finds these four dimensions advertized for two superficially unrelated reasons:
in precisely these dimensions “GS-superstrings” exist (see there).
However, both these explanations have a common origin in something simpler and deeper: Spacetime in these dimensions appears from the “Pauli matrices” with entries in the real normed division algebras. (In fact it goes deeper still, but this will not concern us here.)
This we explain now, and then we use this to obtain a slick handle on spinors in these dimensions, using simple linear algebra over the four real normed division algebras. At the end (in remark 7 below) we give a dictionary that expresses these constructions in terms of the “two-component spinor notation” that is traditionally used in physics texts.
The relation between real spin representations and division algebras is originally due to Kugo-Townsend 82, Sudbery 84 and others. We follow the streamlined discussion in Baez-Huerta 09 and Baez-Huerta 10.
A key extra structure that the spinors impose on the underlying Cartesian space of spacetime is its causal structure, which determines which points in spacetime (“events”) are in the future or the past of other points (def. 29 below). This causal structure will turn out to tightly control the quantum field theory on spacetime in terms of the “causal additivity of the S-matrix” (prop. 91 below) and the induced “causal locality” of the algebra of quantum observables (prop. 94 below). To prepare the discussion of these constructions, we end this chapter with some basics on the causal structure of Minkowski spacetime.
Real division algebras
To amplify the following pattern and to fix our notation for algebra generators, recall these definitions:
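The complex numbers $\mathbb{C}$ is the commutative algebra over the real numbers $\mathbb{R}$ which is generated from one generator $\{e_1\}$ subject to the relation
$$(e_1)^2 = -1.$$
The quaternions $\mathbb{H}$ is the associative algebra over the real numbers which is generated from three generators $\{e_1, e_2, e_3\}$ subject to the relations
for all $i$: $(e_i)^2 = -1$;
for $(i,j,k)$ a cyclic permutation of $(1,2,3)$ then $e_i e_j = e_k$ and $e_j e_i = - e_k$.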
(graphics grabbed from Baez 02)
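Spelled out, and assuming the quaternion relations take their standard form ($e_i^2 = -1$, and $e_i e_j = e_k = - e_j e_i$ for $(i,j,k)$ a cyclic permutation of $(1,2,3)$), the products of generators are the following; the last line is a sample check of associativity on generators.

```latex
\[
\begin{aligned}
  e_1 e_2 &= e_3, & e_2 e_3 &= e_1, & e_3 e_1 &= e_2
  \\
  e_2 e_1 &= - e_3, & e_3 e_2 &= - e_1, & e_1 e_3 &= - e_2
  \\
  ( e_1 e_2 ) e_3 &= e_3 e_3 = -1, &
  e_1 ( e_2 e_3 ) &= e_1 e_1 = -1
\end{aligned}
\]
```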
The octonions $\mathbb{O}$ is the nonassociative algebra over the real numbers which is generated from seven generators $\{e_1, \cdots, e_7\}$ subject to the relations
for all $i$: $(e_i)^2 = -1$;
for $e_i \to e_j \to e_k$ an edge or circle in the diagram shown (a labeled version of the Fano plane) then $e_i e_j = e_k$ and $e_j e_i = - e_k$,
and all relations obtained by cyclic permutation of the indices in these equations.
(graphics grabbed from Baez 02)
One defines the following operations on these real algebras:
(conjugation, real part, imaginary part and absolute value)
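For $\mathbb{K} \in \{ \mathbb{R}, \mathbb{C}, \mathbb{H}, \mathbb{O} \}$, let
$$(-)^\ast \colon \mathbb{K} \to \mathbb{K}$$
be the antihomomorphism of real algebras
$$( r a )^\ast = r a^\ast, \qquad ( a b )^\ast = b^\ast a^\ast \qquad \text{for } r \in \mathbb{R}, \; a, b \in \mathbb{K}$$
given on the generators of def. 16, def. 17 and def. 18 by
$$( e_i )^\ast = - e_i.$$
This operation makes $\mathbb{K}$ into a star algebra. For the complex numbers $\mathbb{C}$ this is called complex conjugation, and in general we call it conjugation.
Let then
$$Re \colon \mathbb{K} \to \mathbb{R}$$
be the function
$$Re(a) := \tfrac{1}{2} ( a + a^\ast )$$
(“real part”) and
$$Im \colon \mathbb{K} \to \mathbb{K}$$
be the function
$$Im(a) := \tfrac{1}{2} ( a - a^\ast )$$
(“imaginary part”).
It follows that for all $a \in \mathbb{K}$ then the product of $a$ with its conjugate is in the real center of $\mathbb{K}$
$$a a^\ast = a^\ast a \;\in \mathbb{R} \hookrightarrow \mathbb{K}$$
and we write the square root of this expression as
$$| a | := \sqrt{ a a^\ast }$$
called the norm or absolute value function
$$| - | \colon \mathbb{K} \to \mathbb{R}.$$
This norm operation clearly satisfies the following properties (for all $a, b \in \mathbb{K}$)
$| a | \geq 0$;
$| a | = 0 \Leftrightarrow a = 0$;
$| a b | = | a | \, | b |$
and hence makes $\mathbb{K}$ a normed algebra.
Since $\mathbb{R}$ is a division algebra, these relations immediately imply that each $\mathbb{K}$ is a division algebra, in that
$$a b = 0 \;\Rightarrow\; ( a = 0 \;\text{or}\; b = 0 ).$$
Hence the conjugation operation makes $\mathbb{K}$ a real normed division algebra.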
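To see the multiplicativity of the norm at work, here is a worked quaternion example. The elements $a$ and $b$ are hypothetical choices; the computation assumes the standard quaternion relations ($e_i^2 = -1$, $e_1 e_2 = e_3$, $e_3 e_1 = e_2$, and sign reversal under exchange of factors) and the conjugation $(e_i)^\ast = - e_i$.

```latex
\[
\begin{aligned}
  a &= 1 + e_1, \qquad b = e_2 + e_3
  \\
  a b &= e_2 + e_3 + e_1 e_2 + e_1 e_3
       = e_2 + e_3 + e_3 - e_2
       = 2 e_3
  \\
  a a^\ast &= ( 1 + e_1 ) ( 1 - e_1 ) = 1 - (e_1)^2 = 2
  \\
  b b^\ast &= - ( e_2 + e_3 ) ( e_2 + e_3 )
            = - \big( -1 + e_2 e_3 + e_3 e_2 - 1 \big)
            = 2
  \\
  | a b | &= \sqrt{ ( 2 e_3 ) ( - 2 e_3 ) } = \sqrt{4} = 2
           = \sqrt{2} \cdot \sqrt{2} = | a | \, | b |
\end{aligned}
\]
```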
(sequence of inclusions of real normed division algebras)
Suitably embedding the sets of generators in def. 16, def. 17 and def. 18 into each other yields sequences of real star-algebra inclusions
$$\mathbb{R} \hookrightarrow \mathbb{C} \hookrightarrow \mathbb{H} \hookrightarrow \mathbb{O}.$$
For example for the first two inclusions we may send each generator to the generator of the same name, and for the last inclusion we may choose
(Hurwitz theorem: $\mathbb{R}$, $\mathbb{C}$, $\mathbb{H}$ and $\mathbb{O}$ are the normed real division algebras)
The four algebras of real numbers $\mathbb{R}$, complex numbers $\mathbb{C}$, quaternions $\mathbb{H}$ and octonions $\mathbb{O}$ from def. 16, def. 17 and def. 18 respectively, which are real normed division algebras via def. 19, are, up to isomorphism, the only real normed division algebras that exist.
(Cayley-Dickson construction and sedenions)
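While prop. 5 says that the sequence from remark 1
$$\mathbb{R} \hookrightarrow \mathbb{C} \hookrightarrow \mathbb{H} \hookrightarrow \mathbb{O}$$
is maximal in the category of real normed non-associative division algebras, there is a pattern that does continue if one disregards the division algebra property. Namely each step in this sequence is given by a construction called forming the Cayley-Dickson double algebra. This continues to an unbounded sequence of real nonassociative star-algebras
$$\mathbb{R} \hookrightarrow \mathbb{C} \hookrightarrow \mathbb{H} \hookrightarrow \mathbb{O} \hookrightarrow \mathbb{S} \hookrightarrow \cdots$$
where the next algebra $\mathbb{S}$ is called the sedenions.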
What actually matters for the following relation of the real normed division algebras to real spin representations is that they are also alternative algebras:
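Given any non-associative algebra $A$, then the trilinear map
$$[-,-,-] \colon A \otimes A \otimes A \longrightarrow A$$
given on any elements $a, b, c \in A$ by
$$[a, b, c] := (a b) c - a (b c)$$
is called the associator (in analogy with the commutator $[a, b] := a b - b a$).
If the associator is completely antisymmetric (in that for any permutation $\sigma$ of three elements then $[a_{\sigma_1}, a_{\sigma_2}, a_{\sigma_3}] = (-1)^{\vert \sigma \vert} [a_1, a_2, a_3]$ for $\vert \sigma \vert$ the signature of the permutation) then $A$ is called an alternative algebra.
If the characteristic of the ground field is different from 2, then alternativity is readily seen to be equivalent to the conditions that for all $a, b \in A$ then
$$(a a) b = a (a b) \qquad \text{and} \qquad (a b) b = a (b b).$$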
We record some basic properties of associators in alternative star-algebras that we need below:
(properties of alternative star algebras)
Let $A$ be an alternative algebra (def. 20) which is also a star algebra. Then (using def. 19):
the associator vanishes when at least one argument is real
the associator changes sign when one of its arguments is conjugated
the associator vanishes when one of its arguments is the conjugate of another
the associator is purely imaginary
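That the associator vanishes as soon as one argument is real is just the linearity of an algebra product over the ground ring.
Hence in fact
$$[a, b, c] = [Im(a), Im(b), Im(c)].$$
This implies the second statement by linearity, since $Im(a^\ast) = - Im(a)$. And so follows the third statement by skew-symmetry:
$$[a, a^\ast, b] = - [a, a, b] = 0.$$
The fourth statement finally follows by this computation:
$$[a, b, c]^\ast = - [c^\ast, b^\ast, a^\ast] = [c, b, a] = - [a, b, c].$$
Here the first equation follows by inspection and using that $(a b)^\ast = b^\ast a^\ast$, the second follows from the first two statements above (each of the three conjugations contributes a sign), and the third is the anti-symmetry of the associator.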
It is immediate to check that:
($\mathbb{R}$, $\mathbb{C}$, $\mathbb{H}$ and $\mathbb{O}$ are real alternative algebras)
The real algebras of real numbers, complex numbers (def. 16), quaternions (def. 17) and octonions (def. 18) are alternative algebras (def. 20).
Since the real numbers, complex numbers and quaternions are associative algebras, their associator vanishes identically. It only remains to see that the associator of the octonions is skew-symmetric. By linearity it is sufficient to check this on generators. So let be a circle or a cyclic permutation of an edge in the Fano plane. Then by definition of the octonion multiplication we have
and similarly
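The alternativity of the octonions can also be checked numerically on sample elements. The following is a minimal sketch, assuming the Cayley-Dickson doubling product $(a,b)(c,d) = (a c - d^\ast b,\; d a + b c^\ast)$ as a model for the multiplication; this is one common convention, and the generators of def. 16, def. 17 and def. 18 may be ordered or signed differently. Elements are flat lists of integers of length 1, 2, 4 or 8, and all function names are hypothetical.

```python
def conj(x):
    """Conjugation (a, b)* = (a*, -b); the identity on real numbers."""
    if len(x) == 1:
        return list(x)
    h = len(x) // 2
    return conj(x[:h]) + [-t for t in x[h:]]

def add(x, y):
    return [s + t for s, t in zip(x, y)]

def sub(x, y):
    return [s - t for s, t in zip(x, y)]

def mul(x, y):
    """Cayley-Dickson product (a, b)(c, d) = (ac - d* b, d a + b c*) (assumed convention)."""
    if len(x) == 1:
        return [x[0] * y[0]]
    h = len(x) // 2
    a, b, c, d = x[:h], x[h:], y[:h], y[h:]
    return sub(mul(a, c), mul(conj(d), b)) + add(mul(d, a), mul(b, conj(c)))

def norm_sq(x):
    """Square of the norm, i.e. the real number a a*."""
    return sum(t * t for t in x)

def associator(a, b, c):
    """[a, b, c] = (ab)c - a(bc)."""
    return sub(mul(mul(a, b), c), mul(a, mul(b, c)))

def unit(n, i):
    """The i-th basis element of the 2^k-dimensional algebra of real dimension n."""
    return [1 if j == i else 0 for j in range(n)]

# Sample octonions with integer components (arbitrary choices).
x = [1, 2, 3, 4, 5, 6, 7, 8]
y = [8, -7, 6, -5, 4, -3, 2, -1]

norm_is_multiplicative = norm_sq(mul(x, y)) == norm_sq(x) * norm_sq(y)
alternative_on_sample = associator(x, x, y) == [0] * 8 and associator(x, y, y) == [0] * 8
octonions_not_associative = associator(unit(8, 1), unit(8, 2), unit(8, 4)) != [0] * 8
```

Calling the same functions on lists of length 4 gives the quaternions, where the associator vanishes identically, and on lists of length 16 gives the sedenions, where the norm is no longer multiplicative in general.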
The analog of the Hurwitz theorem (prop. 5) is now this:
($\mathbb{R}$, $\mathbb{C}$, $\mathbb{H}$ and $\mathbb{O}$ are precisely the alternative real division algebras)
The only division algebras over the real numbers which are also alternative algebras (def. 20) are the real numbers themselves, the complex numbers, the quaternions and the octonions from prop. 7.
This is due to (Zorn 30).
For the following, the key point of alternative algebras is this equivalent characterization:
(alternative algebra detected on subalgebras spanned by any two elements)
A nonassociative algebra is alternative, def. 20, precisely if the subalgebra generated by any two elements is an associative algebra.
This is due to Emil Artin, see for instance (Schafer 95, p. 18).
Proposition 9 is what allows us to carry over a minimum of linear algebra also to the octonions such as to yield a representation of the Clifford algebra on . This happens in the proof of prop. 15 below.
So we will be looking at a fragment of linear algebra over these four normed division algebras. To that end, fix the following notation and terminology:
(hermitian matrices with values in real normed division algebras)
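Let $\mathbb{K}$ be one of the four real normed division algebras from prop. 5, hence equivalently one of the four real alternative division algebras from prop. 8.
Say that an $n \times n$ matrix with coefficients in $\mathbb{K}$
$$A \in Mat_{n \times n}(\mathbb{K})$$
is a hermitian matrix if the transpose matrix $(A^t)_{i j} := A_{j i}$ equals the componentwise conjugated matrix (def. 19):
$$A^t = A^\ast.$$
Hence with the notation
$$(-)^\dagger := ((-)^t)^\ast$$
we have that $A$ is a hermitian matrix precisely if
$$A = A^\dagger.$$
We write $Mat^{her}_{2 \times 2}(\mathbb{K})$ for the real vector space of hermitian $2 \times 2$ matrices.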
(trace reversal)
Let $A \in Mat^{her}_{2 \times 2}(\mathbb{K})$ be a hermitian matrix as in def. 21. Its trace reversal is the result of subtracting its trace times the identity matrix:
$$\tilde A := A - (tr A) \, 1_{2 \times 2}.$$
Minkowski spacetime in dimensions 3,4,6 and 10
We now discover Minkowski spacetime of dimension 3,4,6 and 10, in terms of the real normed division algebras from prop. 5, equivalently the real alternative division algebras from prop. 8: this is prop./def. 10 and def. 23 below.
(Minkowski spacetime as real vector space of hermitian matrices in real normed division algebras)
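Let $\mathbb{K}$ be one of the four real normed division algebras from prop. 5, hence one of the four real alternative division algebras from prop. 8.
Then the real vector space of $2 \times 2$ hermitian matrices over $\mathbb{K}$ (def. 21) equipped with the inner product $\eta$ whose quadratic form ${\vert - \vert}^2_\eta$ is the negative of the determinant operation on matrices is Minkowski spacetime:
$$\mathbb{R}^{\dim_{\mathbb{R}}(\mathbb{K}) + 1, 1} \;:=\; \left( Mat^{her}_{2 \times 2}(\mathbb{K}), \; {\vert - \vert}^2_\eta := - \det \right)$$
hence
$\mathbb{R}^{2,1}$ for $\mathbb{K} = \mathbb{R}$;
$\mathbb{R}^{3,1}$ for $\mathbb{K} = \mathbb{C}$;
$\mathbb{R}^{5,1}$ for $\mathbb{K} = \mathbb{H}$;
$\mathbb{R}^{9,1}$ for $\mathbb{K} = \mathbb{O}$.
Here we think of the vector space on the left as $\mathbb{R}^{p,1}$ with
$$p := \dim_{\mathbb{R}}(\mathbb{K}) + 1$$
equipped with the canonical coordinates labeled $(x^\mu)_{\mu = 0}^{p}$.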
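As a linear map the identification is given by
$$(x^0, x^1, \ldots, x^p) \;\mapsto\; \begin{pmatrix} x^0 + x^1 & y \\ y^\ast & x^0 - x^1 \end{pmatrix} \qquad \text{with} \qquad y := x^2 \, 1 + x^3 \, e_1 + x^4 \, e_2 + \cdots \in \mathbb{K},$$
where the $e_i$ denote the imaginary generators of $\mathbb{K}$.
This means that the quadratic form ${\vert - \vert}^2_\eta$ is given on an element $v = (v^\mu)_{\mu = 0}^{p}$ by
$${\vert v \vert}^2_\eta = - (v^0)^2 + \sum_{j = 1}^{p} (v^j)^2.$$
By the polarization identity the quadratic form induces a bilinear form
$$\eta \colon \mathbb{R}^{p,1} \otimes \mathbb{R}^{p,1} \longrightarrow \mathbb{R}$$
given by
$$\eta(v_1, v_2) = - v_1^0 v_2^0 + \sum_{j = 1}^{p} v_1^j v_2^j.$$
This is called the Minkowski metric.
Finally, under the above identification the operation of trace reversal from def. 22 corresponds to time reversal in that
$$\widetilde{\begin{pmatrix} x^0 + x^1 & y \\ y^\ast & x^0 - x^1 \end{pmatrix}} = \begin{pmatrix} - x^0 + x^1 & y \\ y^\ast & - x^0 - x^1 \end{pmatrix}.$$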
We need to check that under the given identification, the Minkowski norm-square is indeed given by minus the determinant on the corresponding hermitian matrices. This follows from the nature of the conjugation operation from def. 19:
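For the case of the complex numbers the check can be carried out on explicit numbers. The following is a minimal sketch, assuming the identification of a vector $(x^0, x^1, x^2, x^3)$ with the matrix having diagonal entries $x^0 \pm x^1$ and off-diagonal entry $y = x^2 + i x^3$; this layout and the function names are assumptions made for illustration.

```python
def to_hermitian(x):
    """Assumed identification of (x0, x1, x2, x3) with a 2x2 complex hermitian matrix."""
    x0, x1, x2, x3 = x
    y = complex(x2, x3)
    return ((x0 + x1, y), (y.conjugate(), x0 - x1))

def det(A):
    """Determinant of a 2x2 matrix given as a pair of rows."""
    (a, b), (c, d) = A
    return a * d - b * c

def minkowski_sq(x):
    """Quadratic form with signature (-, +, +, +)."""
    x0, x1, x2, x3 = x
    return -x0 ** 2 + x1 ** 2 + x2 ** 2 + x3 ** 2

def trace_reversal(A):
    """Subtract the trace times the identity matrix (def. 22)."""
    (a, b), (c, d) = A
    t = a + d
    return ((a - t, b), (c, d - t))

x = (3, 1, 2, -1)                      # arbitrary sample vector
A = to_hermitian(x)
time_reversed = to_hermitian((-x[0], x[1], x[2], x[3]))

quadratic_form_matches = (-det(A)).real == minkowski_sq(x)
trace_reversal_is_time_reversal = trace_reversal(A) == time_reversed
```

The determinant of a hermitian matrix of this form is $(x^0)^2 - (x^1)^2 - y y^\ast$, which is why its negative reproduces the Minkowski quadratic form.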
(physical units of length)
As the term “metric” suggests, in application to physics, the Minkowski metric $\eta$ in prop./def. 10 is regarded as a measure of length: for $v$ a tangent vector at a point $x$ in Minkowski spacetime, interpreted as a displacement from event $x$ to event $x + v$, then
if $\eta(v,v) > 0$ then
$$\sqrt{\eta(v,v)} \in \mathbb{R}$$
is interpreted as a measure for the spatial distance between $x$ and $x + v$;
if $\eta(v,v) < 0$ then
$$\sqrt{- \eta(v,v)} \in \mathbb{R}$$
is interpreted as a measure for the time distance between $x$ and $x + v$.
But for this to make physical sense, an operational prescription needs to be specified that tells the experimenter how the real number $\sqrt{\eta(v,v)}$ is to be translated into a physical distance between actual events in the observable universe.
Such an operational prescription is called a physical unit of length. For example “centimeter” $cm$ is a physical unit of length, another one is “femtometer” $fm$.
The combined information of a real number and a physical unit of length such as meter, jointly written
is a prescription for finding actual distance in the observable universe. Alternatively
is another prescription, that translates the same real number into another physical distance.
But of course they are related, since physical units form a torsor over the multiplicative group of positive real numbers, meaning that any two are related by a unique rescaling. For example
with .
This means that once any one prescription of turning real numbers into spacetime distances is specified, then any other such prescription is obtained from this by rescaling these real numbers. For example
The point to notice here is that, via the last line, we may think of this as rescaling the metric from to .
In quantum field theory physical units of length are typically expressed in terms of a physical unit of “action”, called “Planck's constant” $\hbar$, via the combination of units called the Compton wavelength
$$\ell_m := \frac{2 \pi \hbar}{m c}$$
parameterized, in turn, by a physical unit of mass . For the mass of the electron, the Compton wavelength is
Another physical unit of length parameterized by a mass $m$ is the Schwarzschild radius $r_m := 2 m G / c^2$, where $G$ is the gravitational constant. Solving the equation
for yields the Planck mass
The corresponding Compton wavelength is given by the Planck length
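To get a feeling for the orders of magnitude, the two length scales can be evaluated numerically. The following is a minimal sketch, assuming the conventions $\ell_m = 2\pi\hbar/(m c)$ and $r_m = 2 m G/c^2$ and using CODATA 2018 values in SI units; the function names are hypothetical, and the numerical prefactor of the resulting mass relative to $\sqrt{\hbar c / G}$ depends on these conventions.

```python
import math

# CODATA 2018 values in SI units (h and c are exact by definition of the SI).
H = 6.62607015e-34          # Planck constant h, in J s
HBAR = H / (2 * math.pi)    # reduced Planck constant
C = 299792458.0             # speed of light, in m / s
G = 6.67430e-11             # gravitational constant, in m^3 / (kg s^2)
M_ELECTRON = 9.1093837015e-31  # electron mass, in kg

def compton_wavelength(m):
    """Compton wavelength 2 pi hbar / (m c), in meters (assumed convention)."""
    return 2 * math.pi * HBAR / (m * C)

def schwarzschild_radius(m):
    """Schwarzschild radius 2 m G / c^2, in meters."""
    return 2 * G * m / C ** 2

def crossover_mass():
    """Mass at which the two lengths above coincide: m^2 = pi hbar c / G."""
    return math.sqrt(math.pi * HBAR * C / G)

electron_wavelength = compton_wavelength(M_ELECTRON)   # of order 1e-12 m
m_star = crossover_mass()                               # of order 1e-8 kg
l_star = compton_wavelength(m_star)                     # of order 1e-35 m
```

Rescaling all masses by a positive factor rescales the Compton wavelength inversely and the Schwarzschild radius proportionally, which is why the two curves cross at exactly one mass.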
(Minkowski spacetime as a pseudo-Riemannian Cartesian space)
Prop./def. 10 introduces Minkowski spacetime $\mathbb{R}^{p,1}$ for $p + 1 \in \{3, 4, 6, 10\}$ as a vector space equipped with a norm ${\vert - \vert}_\eta$. The genuine spacetime corresponding to this is this vector space regarded as a Cartesian space, i.e. with smooth functions (instead of just linear maps) to it and from it (def. 1). This still carries one copy of $\mathbb{R}^{p,1}$ over each point $x \in \mathbb{R}^{p,1}$, as its tangent space (example 5)
$$T_x \mathbb{R}^{p,1} \simeq \mathbb{R}^{p,1}$$
and the Cartesian space $\mathbb{R}^{p+1}$ equipped with the Lorentzian inner product from prop./def. 10 on each tangent space (a “pseudo-Riemannian Cartesian space”) is Minkowski spacetime as such.
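We write
$$dvol := d x^0 \wedge d x^1 \wedge \cdots \wedge d x^p$$
for the canonical volume form on Minkowski spacetime.
We use the Einstein summation convention: Expressions with repeated indices indicate summation over the range of indices.
For example a differential 1-form $\alpha$ on Minkowski spacetime may be expanded as
$$\alpha = \alpha_\mu \, d x^\mu.$$
Moreover we use square brackets around indices to indicate skew-symmetrization. For example a differential 2-form $\beta$ on Minkowski spacetime may be expanded as
$$\beta = \beta_{\mu \nu} \, d x^\mu \wedge d x^\nu = \beta_{[\mu \nu]} \, d x^\mu \wedge d x^\nu.$$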
The identification of Minkowski spacetime (def. 23) in the exceptional dimensions with the generalized Pauli matrices (prop./def. 10) has some immediate useful implications:
(Minkowski metric in terms of trace reversal)
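In terms of the trace reversal operation $\widetilde{(-)}$ from def. 22, the determinant operation on hermitian matrices (def. 21) has the following alternative expression
$$- \det(A) = A \tilde A = \tilde A A$$
and the Minkowski inner product from prop. 10 has the alternative expression
$$\eta(A, B) = \tfrac{1}{2} Re\left( tr(A \tilde B) \right) = \tfrac{1}{2} Re\left( tr(\tilde A B) \right).$$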
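(special linear group $SL(2, \mathbb{K})$ acts by linear isometries on Minkowski spacetime)
For $\mathbb{K}$ one of the four real normed division algebras (prop. 5) the special linear group $SL(2, \mathbb{K})$ acts on Minkowski spacetime $\mathbb{R}^{p,1}$ in dimension $p + 1 = \dim_{\mathbb{R}}(\mathbb{K}) + 2$ (def. 23) by linear isometries given under the identification with the Pauli matrices in prop./def. 10 by conjugation:
$$A \;\mapsto\; G \, A \, G^\dagger \qquad \text{for } G \in SL(2, \mathbb{K}).$$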
For this is immediate from matrix calculus, but we spell it out now. While the argument does not directly apply to the case of the octonions, one can check that it still goes through, too.
First we need to see that the action is well defined. This follows from the associativity of matrix multiplication and the fact that forming conjugate transpose matrices is an antihomomorphism: $(G_1 G_2)^\dagger = G_2^\dagger G_1^\dagger$. In particular this implies that the action indeed sends hermitian matrices to hermitian matrices:
$$(G A G^\dagger)^\dagger = (G^\dagger)^\dagger A^\dagger G^\dagger = G A G^\dagger.$$
By prop./def. 10 such an action is an isometry precisely if it preserves the determinant. This follows from the multiplicative property of determinants: $\det(A B) = \det(A) \det(B)$ and their compatibility with conjugate transposition: $\det(A^\dagger) = \det(A)^\ast$, and finally by the assumption that $G$ is an element of the special linear group, hence that its determinant is $1$:
$$\det(G A G^\dagger) = \det(G) \, \det(A) \, \det(G)^\ast = \det(A).$$
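The argument can be traced on a concrete example over the complex numbers. The following is a minimal sketch with hypothetical helper names and arbitrarily chosen sample matrices; it checks that conjugation by a matrix of determinant 1 maps a hermitian matrix to a hermitian matrix with the same determinant, and that the product of a hermitian matrix with its trace reversal is minus its determinant times the identity.

```python
def matmul(A, B):
    """Product of two 2x2 matrices given as pairs of rows."""
    return tuple(
        tuple(sum(A[i][k] * B[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

def dagger(A):
    """Conjugate transpose."""
    return tuple(
        tuple(complex(A[j][i]).conjugate() for j in range(2)) for i in range(2)
    )

def det(A):
    (a, b), (c, d) = A
    return a * d - b * c

def trace_reversal(A):
    """Subtract the trace times the identity matrix (def. 22)."""
    (a, b), (c, d) = A
    t = a + d
    return ((a - t, b), (c, d - t))

def act(G, A):
    """The conjugation action A -> G A G^dagger."""
    return matmul(matmul(G, A), dagger(G))

A = ((4, 2 - 1j), (2 + 1j, 2))   # hermitian sample matrix, det(A) = 3
G = ((1, 1j), (0, 1))            # sample element of SL(2, C), det(G) = 1

B = act(G, A)
stays_hermitian = B == dagger(B)
preserves_determinant = abs(det(B) - det(A)) < 1e-12
product_with_trace_reversal = matmul(A, trace_reversal(A))   # equals -det(A) times identity
```

Since the Minkowski quadratic form is minus the determinant, preservation of the determinant is exactly the statement that the action is by isometries.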
In fact the special linear groups of linear isometries in prop. 12 are the spin groups (def. 26 below) in these dimensions.
exceptional spinors and real normed division algebras
This we explain now.
Lorentz group and spin group
For $d \in \mathbb{N}$, write
$$O(d-1, 1) \hookrightarrow GL(\mathbb{R}^d)$$
for the subgroup of the general linear group on those linear maps $\Lambda$ which preserve the bilinear form $\eta$ on Minkowski spacetime $\mathbb{R}^{d-1,1}$ (def. 23), in that
$$\eta(\Lambda(-), \Lambda(-)) = \eta(-,-).$$
This is the Lorentz group in dimension $d$.
The elements in the Lorentz group in the image of the special orthogonal group $SO(d-1) \hookrightarrow O(d-1,1)$ are rotations in space. The further elements in the special Lorentz group $SO(d-1,1)$, which mathematically are “hyperbolic rotations” in a space-time plane, are called boosts in physics.
One distinguishes the following further subgroups of the Lorentz group $O(d-1,1)$:
the proper Lorentz group
$$SO(d-1,1) \hookrightarrow O(d-1,1)$$
is the subgroup of elements which have determinant +1 (as elements of the general linear group);
the proper orthochronous (or restricted) Lorentz group
$$SO^+(d-1,1) \hookrightarrow SO(d-1,1)$$
is the further subgroup of elements $\Lambda$ which preserve the time orientation of timelike vectors $v$ in that $(v^0 > 0) \Rightarrow ((\Lambda v)^0 > 0)$.
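These conditions are easy to test on explicit matrices. The following is a minimal sketch in spacetime dimension 1+1, assuming the metric $diag(-1, +1)$; the helper names are hypothetical. It checks the defining condition $\Lambda^t \eta \Lambda = \eta$ for a boost and for time reflection, and then reads off the determinant and the sign of the time-time component.

```python
import math

ETA = [[-1.0, 0.0], [0.0, 1.0]]   # Minkowski metric in dimension 1+1

def matmul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)] for i in range(n)]

def transpose(A):
    return [list(row) for row in zip(*A)]

def boost(rapidity):
    """Hyperbolic rotation in the space-time plane."""
    c, s = math.cosh(rapidity), math.sinh(rapidity)
    return [[c, s], [s, c]]

def is_lorentz(L, tol=1e-9):
    """Check the defining condition L^t eta L = eta."""
    M = matmul(transpose(L), matmul(ETA, L))
    return all(abs(M[i][j] - ETA[i][j]) < tol for i in range(2) for j in range(2))

def det2(L):
    return L[0][0] * L[1][1] - L[0][1] * L[1][0]

def is_proper_orthochronous(L, tol=1e-9):
    """Determinant +1 and positive time-time component."""
    return is_lorentz(L, tol) and abs(det2(L) - 1.0) < tol and L[0][0] > 0

B = boost(0.7)
T = [[-1.0, 0.0], [0.0, 1.0]]     # reflection in time
```

Here `B` lies in the proper orthochronous subgroup, while `T` preserves the metric but has determinant $-1$ and reverses the time orientation.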
(connected component of Lorentz group)
As a smooth manifold, the Lorentz group (def. 24) has four connected components. The connected component of the identity is the proper orthochronous Lorentz group (def. 24). The other three components are
,
where, as matrices,
is the operation of point reflection at the origin in space, where
is the operation of reflection in time and hence where
is point reflection in spacetime.
The following concept of the Clifford algebra (def. 25) of Minkowski spacetime encodes the structure of the inner product space in terms of algebraic operations (“geometric algebra”), such that the action of the Lorentz group becomes represented by a conjugation action (example 7 below). In particular this means that every element of the proper orthochronous Lorentz group may be “split in half” to yield a double cover: the spin group (def. 26 below).
For , we write
for the -graded associative algebra over which is generated from generators in odd degree (“Clifford generators”), subject to the relation
where is the inner product of Minkowski spacetime as in def. 23.
These relations say equivalently that
We write
for the antisymmetrized product of Clifford generators. In particular, if all the are pairwise distinct, then this is simply the plain product of generators
Finally, write
for the algebra anti-automorphism given by
(vectors inside Clifford algebra)
By construction, the vector space of linear combinations of the generators in a Clifford algebra (def. 25) is canonically identified with Minkowski spacetime (def. 23)
via
hence via
such that the defining quadratic form on is identified with the anti-commutator in the Clifford algebra
where on the right we are, in turn, identifying with the linear span of the unit in .
The key point of the Clifford algebra (def. 25) is that it realizes spacetime reflections, rotations and boosts via conjugation actions:
(Clifford conjugation)
For and the Minkowski spacetime of def. 23, let be any vector, regarded as an element via remark 4.
Then
the conjugation action of a single Clifford generator on sends to its reflection at the hyperplane ;
sends to the result of rotating it in the -plane through an angle .
This is immediate by inspection:
For the first statement, observe that conjugating the Clifford generator with yields up to a sign, depending on whether or not:
Therefore for then is the result of multiplying the -component of by .
For the second statement, observe that
This is the canonical action of the Lorentzian special orthogonal Lie algebra . Hence
is the rotation action as claimed.
Since the reflections, rotations and boosts in example 7 are given by conjugation actions, there is a crucial ambiguity in the Clifford elements that induce them:
the conjugation action by coincides precisely with the conjugation action by ;
the conjugation action by coincides precisely with the conjugation action by .
For , the spin group is the group of even graded elements of the Clifford algebra (def. 25) which are unitary with respect to the conjugation operation from def. 25:
The function
from the spin group (def. 26) to the general linear group in -dimensions given by sending to the conjugation action
(via the identification of Minkowski spacetime as the subspace of the Clifford algebra containing the linear combinations of the generators, according to remark 4)
is
a group homomorphism onto the proper orthochronous Lorentz group (def. 24):
exhibiting a $\mathbb{Z}/2$-central extension.
That the function is a group homomorphism into the general linear group, hence that it acts by linear transformations on the generators follows by using that it clearly lands in automorphisms of the Clifford algebra.
That the function lands in the Lorentz group follows from remark 4:
That it moreover lands in the proper Lorentz group follows from observing (example 7) that every reflection is given by the conjugation action by a linear combination of generators, which are excluded from the group (as that is defined to be in the even subalgebra).
To see that the homomorphism is surjective, use that all elements of are products of rotations in hyperplanes. If a hyperplane is spanned by the bivector , then such a rotation is given, via example 7 by the conjugation action by
for some , hence is in the image.
That the kernel is $\mathbb{Z}/2$ is clear from the fact that the only even Clifford elements which commute with all vectors are the multiples $a$ of the identity. For these the conjugation operation from def. 25 acts trivially, and hence the unitarity condition from def. 26 is equivalent to $a^2 = 1$, which leaves $a = \pm 1$. It is clear that these two elements are in the center of the spin group. This kernel reflects the ambiguity from remark 5.
Spinors in dimensions 3, 4, 6 and 10
We now discuss how real spin representations (def. 26) in spacetime dimensions 3,4, 6 and 10 are naturally induced from linear algebra over the four real alternative division algebras (prop. 5).
(Clifford algebra via normed division algebra)
Let be one of the four real normed division algebras from prop. 5, hence one of the four real alternative division algebras from prop. 8.
Define a real linear map
from (the real vector space underlying) Minkowski spacetime to real linear maps on
Here on the right we are using the isomorphism from prop. 10 for identifying a spacetime vector with a -matrix, and we are using the trace reversal from def. 22.
(Clifford multiplication via octonion-valued matrices)
Each operation of in def. 27 is clearly a linear map, even for being the non-associative octonions. The only point to beware of is that for the octonions, then the composition of two such linear maps is not in general given by the usual matrix product.
(real spin representations via normed division algebras)
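The map in def. 27 gives a representation of the Clifford algebra (def. 25), i.e. of
$Cl(\mathbb{R}^{2,1})$ for $\mathbb{K} = \mathbb{R}$;
$Cl(\mathbb{R}^{3,1})$ for $\mathbb{K} = \mathbb{C}$;
$Cl(\mathbb{R}^{5,1})$ for $\mathbb{K} = \mathbb{H}$;
$Cl(\mathbb{R}^{9,1})$ for $\mathbb{K} = \mathbb{O}$.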
Hence this Clifford representation induces representations of the spin group on the real vector spaces
and hence on
We need to check that the Clifford relation
is satisfied (where we used (11) and (8)). Now by definition, for any then
where on the right we have in each component ordinary matrix product expressions.
Now observe that both expressions on the right are sums of triple products that involve either one real factor or two factors that are conjugate to each other:
The associators of triple products that involve a real factor, and of those that involve both an element and its conjugate, vanish by prop. 6 (hence ultimately by Artin’s theorem, prop. 9). In conclusion all associators involved vanish, so that we may rebracket to obtain
This implies the statement via the equality $A \tilde A = \tilde A A = - \det(A)$ (prop. 11).
Let be one of the four real normed division algebras and the corresponding spin representation from prop. 15.
Then there are bilinear maps from two spinors (according to prop. 15) to the real numbers
as well as to
given, respectively, by forming the real part (def. 19) of the canonical -inner product
and by forming the product of a column vector with a row vector to produce a matrix, possibly up to trace reversal (def. 22) under the identification from prop. 10:
and
For the -component of this map is
These pairings have the following properties
both are equivariant with respect to the action of the spin group;
the pairing is symmetric:
(Baez-Huerta 09, prop. 8, prop. 9).
(two-component spinor notation)
In the physics/QFT literature the expressions for spin representations given by prop. 15 are traditionally written in two-component spinor notation as follows:
An element of is denoted and called a left handed spinor;
an element of is denoted and called a right handed spinor;
an element of is denoted
and called a Dirac spinor;
and the Clifford action of def. 27 corresponds to the generalized “Pauli matrices”:
a hermitian matrix as in prop 10 regarded as a linear map via def. 27 is denoted
the negative of the trace-reversal (def. 22) of such a hermitian matrix, regarded as a linear map , is denoted
the corresponding Clifford generator (def. 27) is denoted
the bilinear spinor-to-vector pairing from prop. 16 is written as the matrix multiplication
where the Dirac conjugate on the left is given on by
hence, with (13):
Finally, it is common to abbreviate contractions with the Clifford algebra generators by a slash, as in
or
This is called the Feynman slash notation.
(e.g. Dermisek I-8, Dermisek I-9)
Below we spell out the example of the Lagrangian field theory of the Dirac field in detail (example 43). For discussion of massive chiral spinor fields one also needs the following, here we just mention this for completeness:
(chiral spinor mass pairing)
In dimension 2+1 and 3+1, there exists a non-trivial skew-symmetric pairing
which may be normalized such that in the two-component spinor basis of remark 7 we have
Take the non-vanishing components of to be
and
With this equation (17) is checked explicitly. It is clear that the pairing thus defined is skew symmetric as long as the component algebra is commutative, which is the case for $\mathbb{K}$ being $\mathbb{R}$ or $\mathbb{C}$.
Causal structure
We need to consider the following concepts and constructions related to the causal structure of Minkowski spacetime (def. 23).
(spacelike, timelike, lightlike directions; past and future)
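Given two points $x, y$ in Minkowski spacetime (def. 23), write
$$v := y - x$$
for their difference, using the vector space structure underlying Minkowski spacetime.
Recall the Minkowski inner product $\eta$, given by prop./def. 10. Then via remark 3 we say that the difference vector $v$ is
spacelike if $\eta(v,v) > 0$,
timelike if $\eta(v,v) < 0$,
lightlike if $\eta(v,v) = 0$,
towards the future if $y^0 - x^0 > 0$,
towards the past if $y^0 - x^0 < 0$.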
For a point in spacetime (an event), we write
for the subsets of events that are in the timelike future or in the timelike past of , respectively (def. 29) called the open future cone and open past cone, respectively, and
for the subsets of events that are in the timelike or lightlike future or past, respectively, called the closed future cone and closed past cone, respectively.
The union
of the closed future cone and past cone is called the full causal cone of the event . Its boundary is the light cone.
More generally for a subset of events we write
for the union of the future/past closed cones of all events in the subset.
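The case distinctions above translate directly into a small routine. The following is a minimal sketch, assuming the metric signature $(-,+,\ldots,+)$ of prop./def. 10 and points given as tuples $(x^0, x^1, \ldots)$; the function names are hypothetical.

```python
def eta(v, w):
    """Minkowski inner product with signature (-, +, ..., +)."""
    return -v[0] * w[0] + sum(a * b for a, b in zip(v[1:], w[1:]))

def difference(x, y):
    """The difference vector y - x."""
    return tuple(b - a for a, b in zip(x, y))

def classify(x, y):
    """Return the causal type of y - x and its temporal direction."""
    v = difference(x, y)
    q = eta(v, v)
    kind = "spacelike" if q > 0 else "timelike" if q < 0 else "lightlike"
    direction = "future" if v[0] > 0 else "past" if v[0] < 0 else "neither"
    return kind, direction

def in_closed_future_cone(x, y):
    """True if y lies in the timelike or lightlike future of x (or equals x)."""
    v = difference(x, y)
    return eta(v, v) <= 0 and v[0] >= 0

def in_causal_cone(x, y):
    """True if y lies in the union of the closed future and past cones of x."""
    return in_closed_future_cone(x, y) or in_closed_future_cone(y, x)

origin = (0, 0, 0, 0)
```

For example `classify(origin, (2, 1, 0, 0))` reports a timelike difference towards the future, while two events with `in_causal_cone` false in both orders are spacelike separated.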
(compactly sourced causal support)
Consider a vector bundle (def. 7) over Minkowski spacetime (def. 23).
Write $\Gamma_\Sigma(E)$ for the space of smooth sections (def. 5), and write
$$\Gamma_{\Sigma,cp}(E), \; \Gamma_{\Sigma,\pm cp}(E), \; \Gamma_{\Sigma,scp}(E), \; \Gamma_{\Sigma,fcp}(E), \; \Gamma_{\Sigma,pcp}(E), \; \Gamma_{\Sigma,tcp}(E) \;\subset\; \Gamma_\Sigma(E)$$
for the subsets of those smooth sections whose support is
($cp$) inside a compact subset,
($\pm cp$) inside the closed future cone/closed past cone, respectively, of a compact subset,
($scp$) inside the closed causal cone of a compact subset, which equivalently means that the intersection with every (spacelike) Cauchy surface is compact (Sanders 13, theorem 2.2),
($fcp$) inside the past of a Cauchy surface (Sanders 13, def. 3.2),
($pcp$) inside the future of a Cauchy surface (Sanders 13, def. 3.2),
($tcp$) inside the future of one Cauchy surface and the past of another (Sanders 13, def. 3.2).
(Bär 14, section 1, Khavkine 14, def. 2.1)
(causal order)
Consider the relation on the set of subsets of spacetime which says a subset is not prior to a subset , denoted , if does not intersect the causal past of (def. 30), or equivalently that does not intersect the causal future of :
If and we say that the two subsets are spacelike separated.
(causal complement and causal closure of subset of spacetime)
For a subset of spacetime, its causal complement is the complement of the causal cone:
The causal complement of the causal complement is called the causal closure. If
then the subset is called a causally closed subset.
Given a spacetime , we write
for the partially ordered set of causally closed subsets, partially ordered by inclusion .
For a causally closed subset of spacetime (def. 8) say that an adiabatic switching function or infrared cutoff function for this subset is a smooth function of compact support (a bump function) whose restriction to some neighbourhood of the subset is the constant function with value $1$:
Often we consider the vector space spanned by a formal variable (the coupling constant) under multiplication with smooth functions, and consider as adiabatic switching functions the corresponding images in this space,
which are thus bump functions constant over a neighbourhood of the subset not on 1 but on the formal parameter:
In this sense we may think of the adiabatic switching as being the spacetime-dependent coupling “constant”.
The following lemma 1 will be key in the derivation (proof of prop. 92 below) of the causal locality of the algebra of quantum observables in perturbative quantum field theory:
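That such switching functions exist can be seen from an explicit construction. The following is a minimal one-dimensional sketch, assuming the standard building block $t \mapsto e^{-1/t}$ for smooth cutoffs; the interval bounds `inner` and `outer` and all function names are hypothetical. The resulting function is constant on 1 over the interval $[-\text{inner}, \text{inner}]$ and vanishes outside $(-\text{outer}, \text{outer})$.

```python
import math

def smooth_tail(t):
    """Smooth function vanishing for t <= 0 and positive for t > 0."""
    return math.exp(-1.0 / t) if t > 0 else 0.0

def smooth_step(t):
    """Smooth function equal to 0 for t <= 0 and to 1 for t >= 1."""
    return smooth_tail(t) / (smooth_tail(t) + smooth_tail(1.0 - t))

def switching(x, inner=1.0, outer=2.0):
    """Bump function: 1 for |x| <= inner, 0 for |x| >= outer, smooth in between."""
    return smooth_step((outer - abs(x)) / (outer - inner))

def coupling(x, g=0.1):
    """The spacetime-dependent coupling: the switching function times a sample constant g."""
    return g * switching(x)

samples = [switching(x) for x in (-3.0, -1.5, 0.0, 0.5, 1.5, 3.0)]
```

The same formula applied to a suitable distance function gives switching functions on higher-dimensional regions; the sample value `g` stands in for the formal parameter only for the purpose of illustration.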
(causal partition)
Let be a causally closed subset (def. 8) and let be a compactly supported smooth function which vanishes on a neighbourhood , i.e. .
Then there exists a causal partition of in that there exist compactly supported smooth functions such that
By assumption has a Cauchy surface. This may be extended to a Cauchy surface of , such that this is one leaf of a foliation of by Cauchy surfaces, given by a diffeomorphism with the original at zero. There exists then such that the restriction of to the interval is in the causal complement of the given region (def. 8):
Let then be any smooth function with
.
Then
are smooth functions as required.
This concludes our discussion of spin and spacetime. In the next chapter we consider the concept of fields on spacetime.
A field history on a given spacetime (a history of spatial field configurations, see remark 8 below) is a quantity assigned to each point of spacetime (each event), such that this assignment varies smoothly with spacetime points. For instance an electromagnetic field history (example 11 below) is at each point of spacetime a collection of vectors that encode the direction in which a charged particle passing through that point would feel a force (the “Lorentz force”, see example 11 below).
This is readily formalized (def. 34 below): If $F$ denotes the smooth manifold of “values” that the given kind of field may take at any spacetime point, then a field history $\Phi$ is modeled as a smooth function from spacetime $\Sigma$ to this space of values:
$$\Phi \colon \Sigma \longrightarrow F.$$
It will be useful to unify spacetime and the space of field values (the field fiber) into a single manifold, the Cartesian product
$$E := \Sigma \times F$$
and to think of this equipped with the projection map onto the first factor as a fiber bundle of spaces of field values over spacetime
$$E = \Sigma \times F \overset{pr_1}{\longrightarrow} \Sigma.$$
This is then called the field bundle, which specifies the kind of values that the given field species may take at any point of spacetime. Since the space of field values is the fiber of this fiber bundle (def. 6), it is sometimes also called the field fiber. (See also at fiber bundles in physics.)
Given a field bundle , then a field history is a section of that bundle (def. 5). The discussion of field theory concerns the space of all possible field histories, hence the space of sections of the field bundle (example 16 below). This is a very “large” generalized smooth space, called a diffeological space (def. 35 below).
Or rather, in the presence of fermion fields such as the Dirac field (example 35 below), the Pauli exclusion principle demands that the field bundle is a super-manifold, and that the fermionic space of field histories (example 53 below) is a super-geometric generalized smooth space: a super smooth set (def. 48 below).
This smooth structure on the space of field histories will be crucial when we discuss observables of a field theory below, because these are smooth functions on the space of field histories. In particular it is this smooth structure which allows us to derive that linear observables of a free field theory are given by distributions (prop. 37 below). Among these are the point evaluation observables (delta distributions) which are traditionally denoted by the same symbol as the field histories themselves.
Hence there are these aspects of the concept of “field” in physics, which are closely related, but crucially different:
aspects of the concept of fields
| aspect | term | type | description | def. |
|---|---|---|---|---|
| field component | | | coordinate function on jet bundle of field bundle | def. 34, def. 54 |
| field history | | | jet prolongation of section of field bundle | def. 34, def. 55 |
| field observable | | | derivatives of delta-functional on space of sections | def. 71, example 60 |
| averaging of field observable | | | observable-valued distribution | def. 80 |
| algebra of quantum observables | | | non-commutative algebra structure on field observables | def. 127, def. 132 |
We now discuss these topics:
(fields and field histories)
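Given a spacetime $\Sigma$, then a type of fields on $\Sigma$ is a smooth fiber bundle (def. 6)
$$E \overset{fb}{\longrightarrow} \Sigma$$
called the field bundle.
Given a type of fields on $\Sigma$ in this way, then a field history of that type on $\Sigma$ is a term of that type, hence is a smooth section (def. 5) of this bundle, namely a smooth function of the form
$$\Phi \colon \Sigma \longrightarrow E$$
such that composed with the projection map it is the identity function, i.e. such that
$$fb \circ \Phi = id_\Sigma.$$
The set of such sections/field histories is to be denoted
$$\Gamma_\Sigma(E) := \left\{ \Phi \colon \Sigma \to E \;\vert\; fb \circ \Phi = id_\Sigma \right\} \qquad (18)$$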
(field histories are histories of spatial field configurations)
Given a section $\Phi$ of the field bundle (def. 34) and given a spacelike (def. 29) submanifold (def. 44) of spacetime in codimension 1, then the restriction of $\Phi$ to this submanifold may be thought of as a field configuration in space. As different spatial slices are chosen, one obtains such field configurations at different times. It is in this sense that the entirety of a section is a history of field configurations, hence a field history (def. 34).
(possible field histories)
After we give the set of field histories (18) differential geometric structure, below in example 16 and example 33, we call it the space of field histories. This should be read as space of possible field histories; containing all field histories that qualify as being of the type specified by the field bundle .
After we obtain equations of motion below in def. 61, these serve as the “laws of nature” that field histories should obey, and they define the subspace of those field histories that do solve the equations of motion; this will be denoted
and called the on-shell space of field histories (41).
For the time being, not to get distracted from the basic idea of quantum field theory, we will focus on the following simple special case of field bundles:
(trivial vector bundle as a field bundle)
In applications the field fiber $F$ is often a finite dimensional vector space. In this case the trivial field bundle with fiber $F$ is of course a trivial vector bundle (def. 7).
Choosing any linear basis $(\phi^a)_{a = 1}^{s}$ of the field fiber, then over Minkowski spacetime (def. 23) we have canonical coordinates on the total space of the field bundle
$$\left( (x^\mu), (\phi^a) \right)$$
where the index $\mu$ ranges from $0$ to $p$, while the index $a$ ranges from 1 to $s$.
If this trivial vector bundle is regarded as a field bundle according to def. 34, then a field history $\Phi$ is equivalently an $s$-tuple of real-valued smooth functions $\Phi^a \colon \Sigma \to \mathbb{R}$ on spacetime:
$$\Phi = (\Phi^a)_{a = 1}^{s}.$$
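The defining property of a section can be made concrete for a trivial bundle. The following is a minimal sketch with hypothetical names, modeling a point of the total space as a pair consisting of a spacetime point and a tuple of field values, and a field history as a function sending each spacetime point to such a pair over itself.

```python
def projection(point):
    """The bundle projection from the total space onto spacetime (first factor)."""
    x, _values = point
    return x

def make_field_history(components):
    """Turn a tuple of real-valued functions on spacetime into a section of the trivial bundle."""
    def section(x):
        return (x, tuple(f(x) for f in components))
    return section

# A sample field history with two components on 4-dimensional spacetime (arbitrary choices).
phi = make_field_history((
    lambda x: x[0] ** 2,      # first component
    lambda x: x[0] * x[1],    # second component
))

event = (1.0, 2.0, 0.0, 0.0)
value_over_event = phi(event)
```

The section condition is that `projection(phi(x))` returns `x` for every spacetime point `x`, which holds here by construction; only the tuple of component functions carries information.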
(field bundle for real scalar field)
If $\Sigma$ is a spacetime and if the field fiber
$$F := \mathbb{R}$$
is simply the real line, then the corresponding trivial field bundle (def. 34)
$$\Sigma \times \mathbb{R} \overset{pr_1}{\longrightarrow} \Sigma$$
is the trivial real line bundle (a special case of example 9) and the corresponding field type (def. 34) is called the real scalar field on $\Sigma$. A configuration of this field is simply a smooth function on $\Sigma$ with values in the real numbers:
$$\Phi \colon \Sigma \longrightarrow \mathbb{R}.$$
(field bundle for electromagnetic field)
On Minkowski spacetime $\Sigma$ (def. 23), let the field bundle (def. 34) be given by the cotangent bundle
$$E := T^\ast \Sigma.$$
This is a trivial vector bundle (example 9) with canonical field coordinates $(a_\mu)$.
A section of this bundle, hence a field history, is a differential 1-form
$$A \in \Gamma_\Sigma(T^\ast \Sigma) = \Omega^1(\Sigma)$$
on spacetime (def. 9). Interpreted as a field history of the electromagnetic field on $\Sigma$, this is often called the vector potential. Then the de Rham differential (def. 12) of the vector potential is a differential 2-form
$$F := d A$$
known as the Faraday tensor. In the canonical coordinate basis of 2-forms this may be expanded as
$$F = \sum_{i = 1}^{p} E_i \, d x^0 \wedge d x^i + \sum_{1 \leq i \lt j \leq p} B_{i j} \, d x^i \wedge d x^j.$$
Here $(E_i)$ are called the components of the electric field, while $(B_{i j})$ are called the components of the magnetic field.
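The passage from the vector potential to the Faraday tensor can be illustrated numerically. The following is a minimal sketch, assuming the component formula $F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu$ for the de Rham differential of a 1-form and approximating partial derivatives by central differences; the sample potential and all names are hypothetical.

```python
def vector_potential(x):
    """Sample potential A_mu(x) with constant field strength (hypothetical choice)."""
    e0, b0 = 2.0, 3.0
    return [-e0 * x[1], 0.0, b0 * x[1], 0.0]

def partial(field, mu, x, h=1e-3):
    """Central-difference approximation of the partial derivative along x^mu."""
    xp, xm = list(x), list(x)
    xp[mu] += h
    xm[mu] -= h
    fp, fm = field(xp), field(xm)
    return [(a - b) / (2 * h) for a, b in zip(fp, fm)]

def faraday(field, x):
    """Components F[mu][nu] = d_mu A_nu - d_nu A_mu at the point x."""
    d = [partial(field, mu, x) for mu in range(4)]   # d[mu][nu] = d_mu A_nu
    return [[d[mu][nu] - d[nu][mu] for nu in range(4)] for mu in range(4)]

F = faraday(vector_potential, [0.5, 1.0, -2.0, 0.3])
electric_component = F[0][1]   # coefficient of dx^0 ^ dx^1
magnetic_component = F[1][2]   # coefficient of dx^1 ^ dx^2
```

For this sample potential the only non-vanishing independent components are `F[0][1]` and `F[1][2]`, and skew-symmetry `F[mu][nu] == -F[nu][mu]` holds by construction.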
(field bundle for Yang-Mills field over Minkowski spacetime)
Let $\mathfrak{g}$ be a Lie algebra of finite dimension with linear basis $(t_\alpha)$, in terms of which the Lie bracket is given by
$$[t_\alpha, t_\beta] = \gamma^{\gamma}{}_{\alpha \beta} \, t_\gamma.$$
Over Minkowski spacetime $\Sigma$ (def. 23), consider then the field bundle which is the cotangent bundle tensored with the Lie algebra
$$E := T^\ast \Sigma \otimes \mathfrak{g}.$$
This is the trivial vector bundle (example 9) with induced field coordinates
$$(a^\alpha_\mu).$$
A section of this bundle is a Lie algebra-valued differential 1-form
$$A \in \Gamma_\Sigma(T^\ast \Sigma \otimes \mathfrak{g}) = \Omega^1(\Sigma, \mathfrak{g})$$
with components
$$A^\alpha_\mu.$$
This is called a field history for Yang-Mills gauge theory (at least if $\mathfrak{g}$ is a semisimple Lie algebra, see example 41 below).
For $\mathfrak{g} = \mathbb{R}$ the line Lie algebra, this reduces to the case of the electromagnetic field (example 11).
For $\mathfrak{g} = \mathfrak{su}(3)$ this is a field history for the gauge field of the strong nuclear force in quantum chromodynamics.
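How a Lie bracket is encoded by structure constants can be seen in the smallest non-abelian case. The following is a minimal sketch, assuming for illustration the Lie algebra $\mathfrak{so}(3) \simeq \mathfrak{su}(2)$ with structure constants given by the Levi-Civita symbol (not one of the gauge algebras singled out in the text); the function names are hypothetical.

```python
def levi_civita(a, b, c):
    """Totally skew-symmetric symbol on the indices 0, 1, 2."""
    return (a - b) * (b - c) * (c - a) // 2

def bracket(u, v):
    """Lie bracket of two elements given by their components in the basis (t_0, t_1, t_2)."""
    return tuple(
        sum(levi_civita(a, b, c) * u[a] * v[b] for a in range(3) for b in range(3))
        for c in range(3)
    )

def add(*vectors):
    return tuple(sum(components) for components in zip(*vectors))

def jacobi_defect(u, v, w):
    """[u,[v,w]] + [v,[w,u]] + [w,[u,v]], which vanishes in any Lie algebra."""
    return add(bracket(u, bracket(v, w)), bracket(v, bracket(w, u)), bracket(w, bracket(u, v)))

# Components of a sample Lie algebra-valued 1-form at one spacetime point:
# one Lie algebra element for each of the four spacetime directions.
A_at_point = [(1, 0, 0), (0, 2, 0), (0, 0, 0), (1, 1, 1)]
commutator_01 = bracket(A_at_point[0], A_at_point[1])
```

Commutators such as `commutator_01` are what distinguishes the non-abelian case from the electromagnetic field, where all brackets vanish.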
For readers familiar with the concepts of principal bundles and connections on a bundle we include the following example 13 which generalizes the Yang-Mills field over Minkowski spacetime from example 12 to the situation over general spacetimes.
(general Yang-Mills field in fixed topological sector)
Let $\Sigma$ be any spacetime manifold and let $G$ be a compact Lie group with Lie algebra denoted $\mathfrak{g}$. Let $P \to \Sigma$ be a $G$-principal bundle and $\nabla_0$ a chosen connection on it, to be called the background $G$-Yang-Mills field.
Then the field bundle (def. 34) for $G$-Yang-Mills theory in the topological sector $P$ is the tensor product of vector bundles
$$E := (P \times_G \mathfrak{g}) \otimes_\Sigma T^* \Sigma$$
of the adjoint bundle of $P$ and the cotangent bundle of $\Sigma$.
With the choice of $\nabla_0$, every (other) connection $\nabla$ on $P$ uniquely decomposes as
$$\nabla = \nabla_0 + A,$$
where
$$A \in \Gamma_\Sigma(E)$$
is a section of the above field bundle, hence a Yang-Mills field.
The electromagnetic field (example 11) and the Yang-Mills field (example 12, example 13) with differential 1-forms as field histories are the basic examples of gauge fields (we consider this in more detail below in Gauge symmetries). There are also higher gauge fields with differential n-forms as field histories:
(field bundle for B-field)
On Minkowski spacetime $\Sigma$ (def. 23), let the field bundle (def. 34) be given by the skew-symmetrized tensor product of vector bundles of the cotangent bundle with itself
$$E := \wedge^2_\Sigma T^* \Sigma.$$
This is a trivial vector bundle (example 9) with canonical field coordinates $(b_{\mu \nu})$ subject to
$$b_{\mu \nu} = - b_{\nu \mu}.$$
A section of this bundle, hence a field history, is a differential 2-form (def. 11)
$$B \in \Gamma_\Sigma(\wedge^2_\Sigma T^* \Sigma) = \Omega^2(\Sigma)$$
on spacetime.
Given any field bundle, we will eventually need to regard the set of all field histories $\Gamma_\Sigma(E)$ as a “smooth set” itself, a smooth space of sections, to which constructions of differential geometry apply (such as for the discussion of observables and states below). Notably we need to be talking about differential forms on $\Gamma_\Sigma(E)$.
However, a space of sections $\Gamma_\Sigma(E)$ does not in general carry the structure of a smooth manifold; and it carries the correct smooth structure of an infinite dimensional manifold only if $\Sigma$ is a compact space (see at manifold structure of mapping spaces). Even if it does carry infinite dimensional manifold structure, inspection shows that this is more structure than actually needed for the discussion of field theory. Namely it turns out below that all we need to know is what counts as a smooth family of sections/field histories, hence which functions of sets
$$\Phi_{(-)} \;\colon\; \mathbb{R}^n \to \Gamma_\Sigma(E)$$
from any Cartesian space $\mathbb{R}^n$ (def. 1) into $\Gamma_\Sigma(E)$ count as smooth functions, subject to some basic consistency condition on this choice.
This structure on $\Gamma_\Sigma(E)$ is called the structure of a diffeological space:
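A diffeological space $X$ is a set $X_s$ (its underlying set) together with
for each $n \in \mathbb{N}$ a choice of subset
$$X(\mathbb{R}^n) \subset \mathrm{Hom}_{\mathrm{Set}}(\mathbb{R}^n_s, X_s)$$
of the set of functions from the underlying set of $\mathbb{R}^n$ to $X_s$, to be called the smooth functions or plots from $\mathbb{R}^n$ to $X$;
for each smooth function $f \colon \mathbb{R}^{n_1} \to \mathbb{R}^{n_2}$ between Cartesian spaces (def. 1) a choice of function
$$f^\ast \;\colon\; X(\mathbb{R}^{n_2}) \to X(\mathbb{R}^{n_1})$$
to be thought of as the precomposition operation
$$\left( \mathbb{R}^{n_2} \overset{\Phi}{\to} X \right) \;\mapsto\; \left( \mathbb{R}^{n_1} \overset{f}{\to} \mathbb{R}^{n_2} \overset{\Phi}{\to} X \right)$$
such that
(constant functions are smooth) $X(\mathbb{R}^0) = X_s$;
If $\mathrm{id}_{\mathbb{R}^n}$ is the identity function on $\mathbb{R}^n$, then $(\mathrm{id}_{\mathbb{R}^n})^\ast$ is the identity function on the set of plots $X(\mathbb{R}^n)$;
If $\mathbb{R}^{n_1} \overset{f}{\to} \mathbb{R}^{n_2} \overset{g}{\to} \mathbb{R}^{n_3}$ are two composable smooth functions between Cartesian spaces (def. 1), then pullback of plots along them consecutively equals the pullback along the composition:
$$f^\ast \circ g^\ast = (g \circ f)^\ast$$
i.e. $f^\ast(g^\ast \Phi) = (g \circ f)^\ast \Phi$ for every plot $\Phi \in X(\mathbb{R}^{n_3})$.
(gluing)
If $\{ U_i \overset{f_i}{\to} \mathbb{R}^n \}_{i \in I}$ is a differentiably good open cover of a Cartesian space (def. 3) then the function which restricts $\mathbb{R}^n$-plots of $X$ to a set of $U_i$-plots
$$X(\mathbb{R}^n) \overset{((f_i)^\ast)_{i \in I}}{\hookrightarrow} \prod_{i \in I} X(U_i)$$
is a bijection onto the set of those tuples $(\Phi_i \in X(U_i))_{i \in I}$ of plots, which are “matching families” in that they agree on intersections:
$$\Phi_i \vert_{U_i \cap U_j} = \Phi_j \vert_{U_i \cap U_j} \quad \text{for all } i, j \in I.$$
Finally, given $X_1$ and $X_2$ two diffeological spaces, then a smooth function between them
$$f \;\colon\; X_1 \to X_2$$
is
a function of the underlying sets
$$f_s \;\colon\; (X_1)_s \to (X_2)_s$$
such that
for $\Phi \in X_1(\mathbb{R}^n)$ a plot of $X_1$, then the composition $f_s \circ \Phi$ is a plot of $X_2$:
$$f_s \circ \Phi \in X_2(\mathbb{R}^n).$$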
(Stated more abstractly, this says simply that diffeological spaces are the concrete sheaves on the site of Cartesian spaces from def. 3.)
For more background on diffeological spaces see also geometry of physics -- smooth sets.
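The two functoriality conditions on pullback of plots can be made concrete in a few lines of Python. This is only a toy sketch under stated assumptions: plots are modelled as plain Python callables (smoothness is not checked), the diffeological space is taken to be the real line, and all names (`pullback`, `compose`, `f`, `g`, `phi`) are hypothetical and chosen for illustration.

```python
import math


def pullback(f):
    """Return f^*, which sends a plot phi to the precomposition phi o f."""
    return lambda phi: (lambda u: phi(f(u)))


def compose(g, f):
    """Return the composite function g o f."""
    return lambda u: g(f(u))


def identity(u):
    return u


# Hypothetical sample data: smooth maps R^1 -f-> R^2 -g-> R^1
# and a plot phi : R^1 -> X, with X the real line.
def f(t):
    return (t, t * t)


def g(p):
    return p[0] + math.sin(p[1])


def phi(s):
    return math.exp(s)


samples = [-1.0, 0.0, 0.5, 2.0]

# Consecutive pullback f^*(g^* phi) versus pullback along the composite (g o f)^* phi.
lhs = pullback(f)(pullback(g)(phi))
rhs = pullback(compose(g, f))(phi)
functorial = all(lhs(t) == rhs(t) for t in samples)

# Pullback along the identity leaves plots unchanged.
unital = all(pullback(identity)(phi)(s) == phi(s) for s in samples)

print(functorial, unital)
```

Both checks only compare values at a few sample points, so they illustrate the conditions rather than prove them; the gluing condition is not modelled here.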
(Cartesian spaces are diffeological spaces)
Let $X$ be a Cartesian space (def. 1). Then it becomes a diffeological space (def. 35) by declaring its plots $\Phi \in X(\mathbb{R}^n)$ to be the ordinary smooth functions $\Phi \colon \mathbb{R}^n \to X$.
Under this identification, a function between the underlying sets of two Cartesian spaces is a smooth function in the ordinary sense precisely if it is a smooth function in the sense of diffeological spaces.
Stated more abstractly, this statement is an example of the Yoneda embedding over a subcanonical site.
More generally, the same construction makes every smooth manifold a smooth set.
(diffeological space of field histories)
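Let $E \overset{fb}{\to} \Sigma$ be a smooth field bundle (def. 34). Then the set of field histories/sections $\Gamma_\Sigma(E)$ (def. 34) becomes a diffeological space (def. 35)
$$\Gamma_\Sigma(E) \in \mathrm{DiffeologicalSpaces}$$
by declaring that a smooth family $\Phi_{(-)}$ of field histories, parameterized over any Cartesian space $U$, is a smooth function out of the Cartesian product manifold of $\Sigma$ with $U$
$$\Phi_{(-)}(-) \;\colon\; U \times \Sigma \to E, \qquad (u, x) \mapsto \Phi_u(x)$$
such that for each $u \in U$ we have $fb \circ \Phi_u(-) = \mathrm{id}_\Sigma$, i.e. each $\Phi_u$ is a section of the field bundle.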
The following example 17 is included only for readers who wonder how infinite-dimensional manifolds fit in. Since we will never actually use infinite-dimensional manifold-structure, this example may be ignored.
(Fréchet manifolds are diffeological spaces)
Consider the particular type of infinite-dimensional manifolds called Fréchet manifolds. Since ordinary smooth manifolds are an example, for $X$ a Fréchet manifold there is a concept of smooth functions $\mathbb{R}^n \to X$. Hence we may give $X$ the structure of a diffeological space (def. 35) by declaring the plots over $\mathbb{R}^n$ to be these smooth functions $\mathbb{R}^n \to X$, with the evident precomposition action.
It turns out then that for $X$ and $Y$ two Fréchet manifolds, there is a natural bijection between the smooth functions $X \to Y$ between them regarded as Fréchet manifolds and regarded as diffeological spaces. Hence it does not matter which of the two perspectives we take (unless of course a diffeological space more general than a Fréchet manifold enters the picture, at which point the second definition generalizes, whereas the first does not).
Stated more abstractly, this means that Fréchet manifolds form a full subcategory of that of diffeological spaces (this prop.):
If is a compact smooth manifold and is a trivial fiber bundle with fiber a smooth manifold, then the set of sections carries a standard structure of a Fréchet manifold (see at manifold structure of mapping spaces). Under the above inclusion of Fréchet manifolds into diffeological spaces, this smooth structure agrees with that from example 16 (see this prop.)
Once the step from smooth manifolds to diffeological spaces (def. 35) is made, characterizing the smooth structure of the space entirely by how we may probe it by mapping smooth Cartesian spaces into it, it becomes clear that the underlying set $X_s$ of a diffeological space $X$ is not actually crucial to support the concept: The space is already entirely defined structurally by the system of smooth plots it has, and the underlying set $X_s$ is recovered from these as the set of plots from the point $\mathbb{R}^0$.
This is crucial for field theory: the spaces of field histories of fermionic fields (def. 50 below) such as the Dirac field (example 53 below) do not have underlying sets of points the way diffeological spaces have. Informally, the reason is that a point is a bosonic object, while the nature of fermionic fields is the opposite of bosonic.
But we may just as well drop the mentioning of the underlying set in the definition of generalized smooth spaces. By simply stripping this requirement off of def. 35 we obtain the following more general and more useful definition (still “bosonic”, though, the supergeometric version is def. 48 below):
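A smooth set $X$ is
for each $n \in \mathbb{N}$ a choice of set
$$X(\mathbb{R}^n) \in \mathrm{Set}$$
to be called the set of smooth functions or plots from $\mathbb{R}^n$ to $X$;
for each smooth function $f \colon \mathbb{R}^{n_1} \to \mathbb{R}^{n_2}$ between Cartesian spaces a choice of function
$$f^\ast \;\colon\; X(\mathbb{R}^{n_2}) \to X(\mathbb{R}^{n_1})$$
to be thought of as the precomposition operation
$$\left( \mathbb{R}^{n_2} \overset{\Phi}{\to} X \right) \;\mapsto\; \left( \mathbb{R}^{n_1} \overset{f}{\to} \mathbb{R}^{n_2} \overset{\Phi}{\to} X \right)$$
such that
If $\mathrm{id}_{\mathbb{R}^n}$ is the identity function on $\mathbb{R}^n$, then $(\mathrm{id}_{\mathbb{R}^n})^\ast$ is the identity function on the set of plots $X(\mathbb{R}^n)$.
If $\mathbb{R}^{n_1} \overset{f}{\to} \mathbb{R}^{n_2} \overset{g}{\to} \mathbb{R}^{n_3}$ are two composable smooth functions between Cartesian spaces, then consecutive pullback of plots along them equals the pullback along the composition:
$$f^\ast \circ g^\ast = (g \circ f)^\ast$$
i.e. $f^\ast(g^\ast \Phi) = (g \circ f)^\ast \Phi$ for every plot $\Phi \in X(\mathbb{R}^{n_3})$.
(gluing)
If $\{ U_i \overset{f_i}{\to} \mathbb{R}^n \}_{i \in I}$ is a differentiably good open cover of a Cartesian space (def. 3) then the function which restricts $\mathbb{R}^n$-plots of $X$ to a set of $U_i$-plots
$$X(\mathbb{R}^n) \overset{((f_i)^\ast)_{i \in I}}{\hookrightarrow} \prod_{i \in I} X(U_i)$$
is a bijection onto the set of those tuples $(\Phi_i \in X(U_i))_{i \in I}$ of plots, which are “matching families” in that they agree on intersections:
$$\Phi_i \vert_{U_i \cap U_j} = \Phi_j \vert_{U_i \cap U_j} \quad \text{for all } i, j \in I.$$
Finally, given $X_1$ and $X_2$ two smooth sets, then a smooth function between them
$$f \;\colon\; X_1 \to X_2$$
is
for each $n \in \mathbb{N}$ a function
$$f_\ast(\mathbb{R}^n) \;\colon\; X_1(\mathbb{R}^n) \to X_2(\mathbb{R}^n)$$
such that
for each smooth function $g \colon \mathbb{R}^{n_1} \to \mathbb{R}^{n_2}$ between Cartesian spaces we have
$$g^\ast \circ f_\ast(\mathbb{R}^{n_2}) = f_\ast(\mathbb{R}^{n_1}) \circ g^\ast,$$
where $g^\ast$ denotes the pullback of plots of $X_2$ on the left and of $X_1$ on the right.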
Stated more abstractly, this simply says that smooth sets are the sheaves on the site of Cartesian spaces from def. 3.
Basing differential geometry on smooth sets is an instance of the general approach to geometry called functorial geometry or topos theory. For more background on this see at geometry of physics -- smooth sets.
First we verify that the concept of smooth sets is a consistent generalization:
(diffeological spaces are smooth sets)
Every diffeological space $X$ (def. 35) is a smooth set (def. 36) simply by forgetting its underlying set of points and remembering only its sets of plots.
In particular therefore each Cartesian space is canonically a smooth set by example 15.
Moreover, given any two diffeological spaces, then the morphisms between them, regarded as diffeological spaces, are the same as the morphisms as smooth sets.
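Stated more abstractly, this means that we have full subcategory inclusions
$$\mathrm{CartSp} \hookrightarrow \mathrm{DiffeologicalSpaces} \hookrightarrow \mathrm{SmoothSet}.$$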
Recall, for the next proposition 17, that in the definition 36 of a smooth set $X$ the sets $X(\mathbb{R}^n)$ are abstract sets which are to be thought of as would-be smooth functions “$\mathbb{R}^n \to X$”. Inside def. 36 this only makes sense in quotation marks, since inside that definition the smooth set $X$ is only being defined, so that inside that definition there is not yet an actual concept of smooth functions of the form “$\mathbb{R}^n \to X$”.
But now that the definition of smooth sets and of morphisms between them has been stated, and seeing that Cartesian spaces $\mathbb{R}^n$ are examples of smooth sets, by example 18, there is now an actual concept of smooth functions $\mathbb{R}^n \to X$, namely as smooth sets. For the concept of smooth sets to be consistent, it ought to be true that this a posteriori concept of smooth functions from Cartesian spaces to smooth sets coincides with the a priori concept, hence that we “may remove the quotation marks” in the above. The following proposition says that this is indeed the case:
(plots of a smooth set really are the smooth functions into the smooth set)
Let $X$ be a smooth set (def. 36). For $n \in \mathbb{N}$, there is a natural function
$$\mathrm{Hom}_{\mathrm{SmoothSet}}(\mathbb{R}^n, X) \to X(\mathbb{R}^n)$$
from the set of homomorphisms of smooth sets from $\mathbb{R}^n$ (regarded as a smooth set via example 18) to $X$, to the set of plots of $X$ over $\mathbb{R}^n$, given by evaluating on the identity plot $\mathrm{id}_{\mathbb{R}^n}$.
This function is a bijection.
This says that the plots of $X$, which initially bootstrap $X$ into being by declaring the would-be smooth functions into $X$, end up being the actual smooth functions into $X$.
This elementary but profound fact is called the Yoneda lemma, here in its incarnation over the site of Cartesian spaces (def. 1).
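The argument behind this bijection is short enough to sketch. The following is the standard Yoneda-type reasoning, written in notation assumed here for illustration: $X$ is a smooth set, $\Phi \in X(\mathbb{R}^n)$ is a plot, $C^\infty(\mathbb{R}^k, \mathbb{R}^n)$ denotes the plots of $\mathbb{R}^n$ over $\mathbb{R}^k$, and $\widehat{\Phi}$ and $F$ are names introduced only for this sketch.

```latex
% Inverse direction: a plot \Phi \in X(\mathbb{R}^n) induces a morphism of smooth sets
% \widehat{\Phi} : \mathbb{R}^n \to X with components
\widehat{\Phi}_{\mathbb{R}^k} \;\colon\; C^\infty(\mathbb{R}^k, \mathbb{R}^n) \to X(\mathbb{R}^k),
\qquad f \;\mapsto\; f^\ast \Phi
% Compatibility with pullback along h : \mathbb{R}^j \to \mathbb{R}^k follows from functoriality:
h^\ast\big( f^\ast \Phi \big) \;=\; (f \circ h)^\ast \Phi \;=\; \widehat{\Phi}_{\mathbb{R}^j}( h^\ast f )
% Evaluating on the identity plot recovers \Phi:
\widehat{\Phi}_{\mathbb{R}^n}(\mathrm{id}_{\mathbb{R}^n}) \;=\; (\mathrm{id}_{\mathbb{R}^n})^\ast \Phi \;=\; \Phi
% Conversely, any morphism F : \mathbb{R}^n \to X is fixed by its value on the identity plot,
% because every plot f of \mathbb{R}^n is the pullback f = f^\ast \mathrm{id}_{\mathbb{R}^n}:
F_{\mathbb{R}^k}(f) \;=\; F_{\mathbb{R}^k}\big( f^\ast \mathrm{id}_{\mathbb{R}^n} \big)
\;=\; f^\ast \, F_{\mathbb{R}^n}(\mathrm{id}_{\mathbb{R}^n})
```

So the two assignments $F \mapsto F_{\mathbb{R}^n}(\mathrm{id}_{\mathbb{R}^n})$ and $\Phi \mapsto \widehat{\Phi}$ are inverse to each other.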
A key class of examples of smooth sets (def. 36) that are not diffeological spaces (def. 35) are universal smooth moduli spaces of differential forms:
(universal smooth moduli spaces of differential forms)
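For $k \in \mathbb{N}$ there is a smooth set (def. 36)
$$\mathbf{\Omega}^k \in \mathrm{SmoothSet}$$
defined as follows:
for $n \in \mathbb{N}$ the set of plots from $\mathbb{R}^n$ to $\mathbf{\Omega}^k$ is the set of smooth differential k-forms on $\mathbb{R}^n$ (def. 11): $\mathbf{\Omega}^k(\mathbb{R}^n) := \Omega^k(\mathbb{R}^n)$;
for $f \colon \mathbb{R}^{n_1} \to \mathbb{R}^{n_2}$ a smooth function (def. 1) the operation of pullback of plots along $f$ is just the pullback of differential forms $f^\ast \colon \Omega^k(\mathbb{R}^{n_2}) \to \Omega^k(\mathbb{R}^{n_1})$ from prop. 2.
That this is functorial is just the standard fact (7) from prop. 2.
For $k = 0$ the smooth set $\mathbf{\Omega}^0$ actually is a diffeological space, in fact under the identification of example 18 this is just the real line:
$$\mathbf{\Omega}^0 \simeq \mathbb{R}^1.$$
But for $k \geq 1$ we have that the set of plots on $\mathbb{R}^0 = \ast$ is a singleton
$$\mathbf{\Omega}^{k \geq 1}(\mathbb{R}^0) \simeq \{0\}$$
consisting just of the zero differential form. The only diffeological space with this property is $\mathbb{R}^0 = \ast$ itself. But $\mathbf{\Omega}^{k \geq 1}$ is far from being that trivial: even though its would-be underlying set is a single point, for all $n \geq k$ it admits an infinite set of plots. Therefore the smooth sets $\mathbf{\Omega}^k$ for $k \geq 1$ are not diffeological spaces.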
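That the smooth set $\mathbf{\Omega}^k$ indeed deserves to be addressed as the universal moduli space of differential k-forms follows from prop. 17: The universal moduli space of $k$-forms ought to carry a universal differential $k$-form $\omega_{univ} \in \Omega^k(\mathbf{\Omega}^k)$ such that every differential $k$-form $\omega$ on any $\mathbb{R}^n$ arises as the pullback of differential forms of this universal one along some modulating morphism $f_\omega \colon \mathbb{R}^n \to \mathbf{\Omega}^k$:
$$\omega = (f_\omega)^\ast \omega_{univ}.$$
But with prop. 17 this is precisely what the definition of the plots of $\mathbf{\Omega}^k$ says.
Similarly, all the usual operations on differential forms now have their universal archetype on the universal moduli spaces of differential forms.
In particular, for $k \in \mathbb{N}$ there is a canonical morphism of smooth sets of the form
$$d \;\colon\; \mathbf{\Omega}^k \to \mathbf{\Omega}^{k+1}$$
defined over $\mathbb{R}^n$ by the ordinary de Rham differential (def. 12)
$$d_{\mathbb{R}^n} \;\colon\; \Omega^k(\mathbb{R}^n) \to \Omega^{k+1}(\mathbb{R}^n).$$
That this satisfies the compatibility with precomposition of plots
$$f^\ast \circ d_{\mathbb{R}^{n_2}} = d_{\mathbb{R}^{n_1}} \circ f^\ast$$
is just the compatibility of pullback of differential forms with the de Rham differential from prop. 2.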
The upshot is that we now have a good definition of differential forms on any diffeological space and more generally on any smooth set:
(differential forms on smooth sets)
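Let $X$ be a diffeological space (def. 35) or more generally a smooth set (def. 36); then a differential k-form $\omega$ on $X$ is equivalently a morphism of smooth sets
$$\omega \;\colon\; X \to \mathbf{\Omega}^k$$
from $X$ to the universal smooth moduli space of differential forms from example 19.
Concretely, by unwinding the definitions of $\mathbf{\Omega}^k$ and of morphisms of smooth sets, this means that such a differential form is:
for each $n \in \mathbb{N}$ and each plot $\mathbb{R}^n \overset{\Phi}{\to} X$ an ordinary differential form
$$\Phi^\ast(\omega) \in \Omega^k(\mathbb{R}^n)$$
such that
for each smooth function $f \colon \mathbb{R}^{n_1} \to \mathbb{R}^{n_2}$ between Cartesian spaces the ordinary pullback of differential forms along $f$ is compatible with these choices, in that for every plot $\mathbb{R}^{n_2} \overset{\Phi}{\to} X$ we have
$$f^\ast\big(\Phi^\ast(\omega)\big) = (f^\ast \Phi)^\ast(\omega),$$
i.e. first pulling back the plot and then evaluating $\omega$ on it agrees with first evaluating $\omega$ on the plot and then pulling back the resulting form.
We write $\Omega^\bullet(X)$ for the set of differential forms on the smooth set $X$ defined this way.
Moreover, given a differential k-form
$$X \overset{\omega}{\to} \mathbf{\Omega}^k$$
on a smooth set $X$ this way, then its de Rham differential $d \omega \in \Omega^{k+1}(X)$ is given by the composite of morphisms of smooth sets with the universal de Rham differential from (23):
$$d \omega \;\colon\; X \overset{\omega}{\to} \mathbf{\Omega}^k \overset{d}{\to} \mathbf{\Omega}^{k+1}.$$
Explicitly this means simply that for $\Phi \colon U \to X$ a plot, then
$$\Phi^\ast(d \omega) = d\big(\Phi^\ast \omega\big) \in \Omega^{k+1}(U).$$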
The usual operations on ordinary differential forms directly generalize plot-wise to differential forms on diffeological spaces and more generally on smooth sets:
(exterior differential and exterior product on smooth sets)
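Let $X$ be a diffeological space (def. 35) or more generally a smooth set (def. 36). Then
For $\omega \in \Omega^n(X)$ a differential form on $X$ (def. 37) its exterior differential
$$d \omega \in \Omega^{n+1}(X)$$
is defined on any plot $U \overset{\Phi}{\to} X$ as the ordinary exterior differential of the pullback of $\omega$ along that plot: $\Phi^\ast(d \omega) := d \Phi^\ast(\omega)$.
For $\omega_1 \in \Omega^{n_1}(X)$ and $\omega_2 \in \Omega^{n_2}(X)$ two differential forms on $X$ (def. 37) then their exterior product
$$\omega_1 \wedge \omega_2 \in \Omega^{n_1 + n_2}(X)$$
is the differential form defined on any plot $U \overset{\Phi}{\to} X$ as the ordinary exterior product of the pullback of the differential forms $\omega_1$ and $\omega_2$ to this plot: $\Phi^\ast(\omega_1 \wedge \omega_2) := \Phi^\ast(\omega_1) \wedge \Phi^\ast(\omega_2)$.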
Infinitesimal geometry
It is crucial in field theory that we consider field histories not only over all of spacetime, but also restricted to submanifolds of spacetime. Or rather, what is actually of interest are the restrictions of the field histories to the infinitesimal neighbourhoods (example 27 below) of these submanifolds. This appears notably in the construction of phase spaces below. Moreover, fermion fields such as the Dirac field (example 35 below) take values in graded infinitesimal spaces, called super spaces (discussed below). Therefore “infinitesimal geometry”, sometimes called formal geometry (as in “formal scheme”) or synthetic differential geometry or synthetic differential supergeometry, is a central aspect of field theory.
In order to mathematically grasp what infinitesimal neighbourhoods are, we appeal to the first magic algebraic property of differential geometry from prop. 1, which says that we may recognize smooth manifolds $X$ dually in terms of their commutative algebras $C^\infty(X)$ of smooth functions on them.
But since there are of course more algebras $A$ than arise this way from smooth manifolds, we may turn this around and try to regard any algebra $A$ as defining a would-be space, which would have $A$ as its algebra of functions.
For example an infinitesimally thickened point should be a space which is “so small” that every smooth function on it which vanishes at the origin takes values so tiny that some finite power of them is not just even more tiny, but actually vanishes:
(infinitesimally thickened Cartesian space)
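An infinitesimally thickened point
$$\mathbb{D} := \mathrm{Spec}(A)$$
is represented by a commutative algebra $A$ which as a real vector space is a direct sum
$$A \simeq_{\mathbb{R}} \langle 1 \rangle \oplus V$$
of the 1-dimensional space $\langle 1 \rangle = \mathbb{R}$ of multiples of 1 with a finite dimensional vector space $V$ that is a nilpotent ideal in that for each element $a \in V$ there exists a natural number $n \in \mathbb{N}$ such that
$$a^{n+1} = 0.$$
More generally, an infinitesimally thickened Cartesian space
$$\mathbb{R}^n \times \mathbb{D} := \mathbb{R}^n \times \mathrm{Spec}(A)$$
is represented by a commutative algebra
$$C^\infty(\mathbb{R}^n) \otimes_{\mathbb{R}} A$$
which is the tensor product of algebras of the algebra of smooth functions $C^\infty(\mathbb{R}^n)$ on an actual Cartesian space of some dimension $n$ (example 2), with an algebra of functions $A$ of an infinitesimally thickened point, as above.
A smooth function between two infinitesimally thickened Cartesian spaces
$$\mathbb{R}^{n_1} \times \mathrm{Spec}(A_1) \overset{f}{\to} \mathbb{R}^{n_2} \times \mathrm{Spec}(A_2)$$
is by definition dually an $\mathbb{R}$-algebra homomorphism of the form
$$C^\infty(\mathbb{R}^{n_1}) \otimes_{\mathbb{R}} A_1 \overset{f^\ast}{\leftarrow} C^\infty(\mathbb{R}^{n_2}) \otimes_{\mathbb{R}} A_2.$$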
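(infinitesimal neighbourhoods in the real line $\mathbb{R}^1$)
Consider the quotient algebra of the formal power series algebra $\mathbb{R}[ [\epsilon] ]$ in a single parameter $\epsilon$ by the ideal generated by $\epsilon^2$:
$$\mathbb{R}[ [\epsilon] ]/(\epsilon^2) \simeq_{\mathbb{R}} \mathbb{R} \oplus \epsilon \mathbb{R}.$$
(This is sometimes called the algebra of dual numbers, for no good reason.) The underlying real vector space of this algebra is, as shown, the direct sum of the multiples of 1 with the multiples of $\epsilon$. A general element in this algebra is of the form
$$a + \epsilon b$$
where $a, b \in \mathbb{R}$ are real numbers. The product in this algebra is given by “multiplying out” as usual, and discarding all terms proportional to $\epsilon^2$:
$$(a_1 + \epsilon b_1) \cdot (a_2 + \epsilon b_2) = a_1 a_2 + \epsilon (a_1 b_2 + b_1 a_2).$$
We may think of an element $a + \epsilon b$ as the truncation to first order of a Taylor series at the origin of a smooth function on the real line
$$f \;\colon\; \mathbb{R} \to \mathbb{R}$$
where $a = f(0)$ is the value of the function at the origin, and where $b = \frac{\partial f}{\partial x}(0)$ is its first derivative at the origin.
Therefore this algebra behaves like the algebra of smooth functions on an infinitesimal neighbourhood $\mathbb{D}^1$ of $0 \in \mathbb{R}$ which is so tiny that its elements $\epsilon$ become, upon squaring them, not just tinier, but actually zero:
$$\epsilon^2 = 0.$$
This intuitive picture is now made precise by the concept of infinitesimally thickened points def. 39, if we simply set
$$\mathbb{D}^1 := \mathrm{Spec}\big( \mathbb{R}[ [\epsilon] ]/(\epsilon^2) \big)$$
and observe that there is the inclusion of infinitesimally thickened Cartesian spaces
$$\mathbb{D}^1 \overset{i}{\hookrightarrow} \mathbb{R}^1$$
which is dually given by the algebra homomorphism
$$\mathbb{R} \oplus \epsilon \mathbb{R} \overset{i^\ast}{\leftarrow} C^\infty(\mathbb{R}^1), \qquad f \mapsto f(0) + \epsilon \frac{\partial f}{\partial x}(0)$$
which sends a smooth function to its value at zero plus $\epsilon$ times its derivative at zero. Observe that this is indeed a homomorphism of algebras due to the product law of differentiation, which says that
$$i^\ast(f \cdot g) = f(0) g(0) + \epsilon \left( \frac{\partial f}{\partial x}(0)\, g(0) + f(0)\, \frac{\partial g}{\partial x}(0) \right) = \left( f(0) + \epsilon \frac{\partial f}{\partial x}(0) \right) \cdot \left( g(0) + \epsilon \frac{\partial g}{\partial x}(0) \right) = i^\ast(f) \cdot i^\ast(g).$$
Hence we see that restricting a smooth function to the infinitesimal neighbourhood of a point is equivalent to restricting attention to its Taylor series to the given order at that point.
Similarly for each $k \in \mathbb{N}$ the algebra
$$\mathbb{R}[ [\epsilon] ]/(\epsilon^{k+1})$$
may be thought of as the algebra of Taylor series at the origin of $\mathbb{R}$ of smooth functions $\mathbb{R} \to \mathbb{R}$, where all terms of order higher than $k$ are discarded. The corresponding infinitesimally thickened point is often denoted
$$\mathbb{D}^1(k) := \mathrm{Spec}\big( \mathbb{R}[ [\epsilon] ]/(\epsilon^{k+1}) \big).$$
This is now the subobject of the real line
$$\mathbb{D}^1(k) \hookrightarrow \mathbb{R}^1$$
on those elements $\epsilon$ such that $\epsilon^{k+1} = 0$.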
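The truncated algebras in this example are easy to experiment with numerically. The following is a minimal sketch, assuming we store an element of the quotient of the power series algebra by the ideal generated by the (k+1)-st power of the parameter as its list of k+1 real coefficients; the class name `Jet` and the helper `eps` are hypothetical names chosen for illustration and do not come from the text.

```python
class Jet:
    """Element c_0 + c_1 eps + ... + c_k eps^k of R[[eps]]/(eps^(k+1))."""

    def __init__(self, coeffs):
        self.c = tuple(float(x) for x in coeffs)

    @property
    def order(self):
        return len(self.c) - 1

    def _coerce(self, other):
        # Real numbers are embedded as multiples of 1.
        if isinstance(other, Jet):
            if other.order != self.order:
                raise ValueError("orders of truncation differ")
            return other
        return Jet([other] + [0.0] * self.order)

    def __add__(self, other):
        other = self._coerce(other)
        return Jet(a + b for a, b in zip(self.c, other.c))

    __radd__ = __add__

    def __mul__(self, other):
        # Multiply out as usual and discard all terms of order above k.
        other = self._coerce(other)
        out = [0.0] * (self.order + 1)
        for i, a in enumerate(self.c):
            for j, b in enumerate(other.c):
                if i + j <= self.order:
                    out[i + j] += a * b
        return Jet(out)

    __rmul__ = __mul__

    def __repr__(self):
        return "Jet(%r)" % (list(self.c),)


def eps(k):
    """The generator eps in R[[eps]]/(eps^(k+1)); it is zero for k = 0."""
    return Jet([0.0] + [1.0 if i == 1 else 0.0 for i in range(1, k + 1)])


# First order: eps squares to zero.
e = eps(1)
square = e * e

# Restricting the polynomial f(x) = 3 + 2x + 5x^2 to the first-order
# neighbourhood of 0 keeps exactly f(0) and f'(0).
f_restricted = 3 + 2 * e + 5 * e * e

# Product law: (3 + 2 eps)(1 + 4 eps) = 3 + (3*4 + 2*1) eps.
product = Jet([3, 2]) * Jet([1, 4])

print(square, f_restricted, product)
```

Evaluating a polynomial on `eps(1)` in this way is the same mechanism that underlies forward-mode automatic differentiation; for larger `k` the same class keeps Taylor coefficients up to order `k`.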
The following example 21 shows that infinitesimal thickening is invisible for ordinary spaces when mapping out of these. In contrast example 22 further below shows that the morphisms into an ordinary space out of an infinitesimal space are interesting: these are tangent vectors and their higher order infinitesimal analogs.
(infinitesimal line has unique global point)
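For $\mathbb{R}^n$ any ordinary Cartesian space (def. 1) and $\mathbb{D}^1(k) \hookrightarrow \mathbb{R}^1$ the order-$k$ infinitesimal neighbourhood of the origin in the real line from example 20, there is exactly one possible morphism of infinitesimally thickened Cartesian spaces from $\mathbb{R}^n$ to $\mathbb{D}^1(k)$:
$$\mathbb{R}^n \overset{\exists !}{\to} \mathbb{D}^1(k).$$
By definition such a morphism is dually an algebra homomorphism
$$C^\infty(\mathbb{R}^n) \overset{f^\ast}{\leftarrow} \mathbb{R}[ [\epsilon] ]/(\epsilon^{k+1})$$
from the higher order “algebra of dual numbers” to the algebra of smooth functions (example 2).
Now this being an $\mathbb{R}$-algebra homomorphism, its action on the multiples $c \in \mathbb{R}$ of the identity is fixed:
$$f^\ast(c) = c.$$
All the remaining elements are proportional to $\epsilon$, and hence are nilpotent. Since $f^\ast$ respects products, it must send the nilpotent element $\epsilon$ to a nilpotent element $f^\ast(\epsilon)$, because
$$\big( f^\ast(\epsilon) \big)^{k+1} = f^\ast\big( \epsilon^{k+1} \big) = f^\ast(0) = 0.$$
But the only nilpotent element in $C^\infty(\mathbb{R}^n)$ is the zero element, and hence it follows that
$$f^\ast(\epsilon) = 0.$$
Thus $f^\ast$ as above is uniquely fixed.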
(synthetic tangent vector fields)
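Let $\mathbb{R}^n$ be a Cartesian space (def. 1), regarded as an infinitesimally thickened Cartesian space (def. 39) and consider $\mathbb{D}^1$, the first order infinitesimal line from example 20.
Then homomorphisms of infinitesimally thickened Cartesian spaces of the form
$$\tilde v \;\colon\; \mathbb{R}^n \times \mathbb{D}^1 \to \mathbb{R}^n$$
hence smoothly $\mathbb{R}^n$-parameterized collections of morphisms
$$\tilde v_x \;\colon\; \mathbb{D}^1 \to \mathbb{R}^n$$
which send the unique base point of $\mathbb{D}^1$ (example 21) to $x \in \mathbb{R}^n$, are in natural bijection with tangent vector fields $v$ on $\mathbb{R}^n$ (example 5).
By definition, the morphisms in question are dually $\mathbb{R}$-algebra homomorphisms of the form
$$C^\infty(\mathbb{R}^n) \oplus \epsilon\, C^\infty(\mathbb{R}^n) \leftarrow C^\infty(\mathbb{R}^n)$$
which are the identity modulo $\epsilon$. Such a morphism has to take any function $f \in C^\infty(\mathbb{R}^n)$ to
$$f + (\partial f)\, \epsilon$$
for some smooth function $(\partial f) \in C^\infty(\mathbb{R}^n)$. The condition that this assignment makes an algebra homomorphism is equivalent to the statement that for all $f_1, f_2 \in C^\infty(\mathbb{R}^n)$ we have
$$f_1 f_2 + \big( \partial (f_1 f_2) \big) \epsilon = \big( f_1 + (\partial f_1) \epsilon \big) \cdot \big( f_2 + (\partial f_2) \epsilon \big).$$
Multiplying this out and using that $\epsilon^2 = 0$, this is equivalent to
$$\partial(f_1 f_2) = (\partial f_1)\, f_2 + f_1\, (\partial f_2).$$
This in turn means equivalently that $\partial \colon C^\infty(\mathbb{R}^n) \to C^\infty(\mathbb{R}^n)$ is a derivation.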
With this the statement follows with the third magic algebraic property of smooth functions (prop. 1): derivations of smooth functions are vector fields.
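To see the homomorphism condition at work in one concrete case, here is a small worked check. The choices are assumptions made only for this illustration: the base space is the real line with coordinate $x$, the vector field is $v = x \, \partial_x$, the corresponding operation on functions is written $\partial f := x \, f'$, the infinitesimal parameter is written $\epsilon$ with $\epsilon^2 = 0$, and the test functions are $f_1 = x$ and $f_2 = x^2$.

```latex
% Images of the two test functions under f \mapsto f + (\partial f)\,\epsilon, with \partial f = x f':
f_1 \mapsto x + x\,\epsilon ,
\qquad
f_2 \mapsto x^2 + 2 x^2\,\epsilon
% Product of the images, discarding the \epsilon^2 term:
(x + x\,\epsilon)(x^2 + 2 x^2\,\epsilon) \;=\; x^3 + (2 x^3 + x^3)\,\epsilon \;=\; x^3 + 3 x^3\,\epsilon
% Image of the product f_1 f_2 = x^3:
x^3 \mapsto x^3 + x \cdot 3 x^2\,\epsilon \;=\; x^3 + 3 x^3\,\epsilon
% Both agree, which is the derivation property in this instance:
\partial(f_1 f_2) \;=\; 3 x^3 \;=\; (\partial f_1)\, f_2 + f_1\, (\partial f_2) \;=\; x \cdot x^2 + x \cdot 2 x^2
```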
We need to consider infinitesimally thickened spaces more general than the thickenings of just Cartesian spaces in def. 39. But just as Cartesian spaces (def. 1) serve as the local test geometries to induce the general concept of diffeological spaces and smooth sets (def. 36), so using infinitesimally thickened Cartesian spaces as test geometries immediately induces the corresponding generalization of smooth sets with infinitesimals:
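A formal smooth set $X$ is
for each infinitesimally thickened Cartesian space $\mathbb{R}^n \times \mathrm{Spec}(A)$ (def. 39) a set
$$X\big( \mathbb{R}^n \times \mathrm{Spec}(A) \big) \in \mathrm{Set}$$
to be called the set of smooth functions or plots from $\mathbb{R}^n \times \mathrm{Spec}(A)$ to $X$;
for each smooth function $f \colon \mathbb{R}^{n_1} \times \mathrm{Spec}(A_1) \to \mathbb{R}^{n_2} \times \mathrm{Spec}(A_2)$ between infinitesimally thickened Cartesian spaces a choice of function
$$f^\ast \;\colon\; X\big( \mathbb{R}^{n_2} \times \mathrm{Spec}(A_2) \big) \to X\big( \mathbb{R}^{n_1} \times \mathrm{Spec}(A_1) \big)$$
to be thought of as the precomposition operation
$$\Phi \;\mapsto\; \Phi \circ f$$
such that
If $\mathrm{id}$ is the identity function on $\mathbb{R}^n \times \mathrm{Spec}(A)$, then $\mathrm{id}^\ast$ is the identity function on the set of plots $X(\mathbb{R}^n \times \mathrm{Spec}(A))$;
If $f$ and $g$ are two composable smooth functions between infinitesimally thickened Cartesian spaces, then pullback of plots along them consecutively equals the pullback along the composition:
$$f^\ast \circ g^\ast = (g \circ f)^\ast$$
i.e. $f^\ast(g^\ast \Phi) = (g \circ f)^\ast \Phi$ for every plot $\Phi$ on the codomain of $g$.
(gluing)
If $\{ U_i \times \mathrm{Spec}(A) \overset{f_i \times \mathrm{id}}{\to} \mathbb{R}^n \times \mathrm{Spec}(A) \}_{i \in I}$ is such that
$$\{ U_i \overset{f_i}{\to} \mathbb{R}^n \}_{i \in I}$$
is a differentiably good open cover (def. 3) then the function which restricts $\mathbb{R}^n \times \mathrm{Spec}(A)$-plots of $X$ to a set of $U_i \times \mathrm{Spec}(A)$-plots
$$X\big( \mathbb{R}^n \times \mathrm{Spec}(A) \big) \overset{((f_i \times \mathrm{id})^\ast)_{i \in I}}{\hookrightarrow} \prod_{i \in I} X\big( U_i \times \mathrm{Spec}(A) \big)$$
is a bijection onto the set of those tuples $(\Phi_i)_{i \in I}$ of plots, which are “matching families” in that they agree on intersections, i.e.
$$\Phi_i \vert_{(U_i \cap U_j) \times \mathrm{Spec}(A)} = \Phi_j \vert_{(U_i \cap U_j) \times \mathrm{Spec}(A)} \quad \text{for all } i, j \in I.$$
Finally, given $X_1$ and $X_2$ two formal smooth sets, then a smooth function between them
$$f \;\colon\; X_1 \to X_2$$
is
for each infinitesimally thickened Cartesian space $U = \mathbb{R}^n \times \mathrm{Spec}(A)$ (def. 39) a function
$$f_\ast(U) \;\colon\; X_1(U) \to X_2(U)$$
such that
for each smooth function $g \colon U_1 \to U_2$ between infinitesimally thickened Cartesian spaces we have
$$g^\ast \circ f_\ast(U_2) = f_\ast(U_1) \circ g^\ast,$$
i.e. the functions $f_\ast(U)$ are compatible with pullback of plots.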
(Dubuc 79)
Basing infinitesimal geometry on formal smooth sets is an instance of the general approach to geometry called functorial geometry or topos theory. For more background on this see at geometry of physics -- manifolds and orbifolds.
We have the evident generalization of example 15 to smooth geometry with infinitesimals:
(infinitesimally thickened Cartesian spaces are formal smooth sets)
For $X$ an infinitesimally thickened Cartesian space (def. 39), it becomes a formal smooth set according to def. 40 by taking its plots out of some $\mathbb{R}^n \times \mathbb{D}$ to be the homomorphisms of infinitesimally thickened Cartesian spaces $\mathbb{R}^n \times \mathbb{D} \to X$:
(Stated more abstractly, this is an instance of the Yoneda embedding over a subcanonical site.)
(smooth sets are formal smooth sets)
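Let $X$ be a smooth set (def. 36). Then $X$ becomes a formal smooth set (def. 40) by declaring the set of plots $X(\mathbb{R}^n \times \mathbb{D})$ over an infinitesimally thickened Cartesian space (def. 39) to be equivalence classes of pairs
$$\big( \mathbb{R}^n \times \mathbb{D} \overset{g}{\to} \mathbb{R}^k \,,\; \mathbb{R}^k \overset{\Phi}{\to} X \big)$$
of a morphism of infinitesimally thickened Cartesian spaces and of a plot of $X$, as shown, subject to the equivalence relation which identifies two such pairs $(g, \Phi)$ and $(g', \Phi')$ if there exists a smooth function $f \colon \mathbb{R}^k \to \mathbb{R}^{k'}$ such that
$$g' = f \circ g \quad \text{and} \quad \Phi = f^\ast \Phi'.$$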
Stated more abstractly this says that as a formal smooth set is the left Kan extension (see this example) of as a smooth set along the functor that includes Cartesian spaces (def. 1) into infinitesimally thickened Cartesian spaces (def. 39).
(reduction and infinitesimal shape)
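For $\mathbb{R}^n \times \mathbb{D}$ an infinitesimally thickened Cartesian space (def. 39) we say that the underlying ordinary Cartesian space $\mathbb{R}^n$ (def. 1) is its reduction
$$\Re\big( \mathbb{R}^n \times \mathbb{D} \big) := \mathbb{R}^n.$$
There is the canonical inclusion morphism
$$\Re\big( \mathbb{R}^n \times \mathbb{D} \big) = \mathbb{R}^n \hookrightarrow \mathbb{R}^n \times \mathbb{D}$$
which dually corresponds to the homomorphism of commutative algebras
$$C^\infty(\mathbb{R}^n) \leftarrow C^\infty(\mathbb{R}^n) \otimes_{\mathbb{R}} A$$
which is the identity on all smooth functions $f \in C^\infty(\mathbb{R}^n)$ and is zero on all elements $a \in V \subset A$ in the nilpotent ideal of $A$ (as in example 21).
Given any formal smooth set $X$, we say that its infinitesimal shape or de Rham shape (also: de Rham stack) is the formal smooth set $\Im X$ (def. 40) defined to have as plots the reductions of the plots of $X$, according to the above:
$$(\Im X)(U) := X\big( \Re(U) \big).$$
There is a canonical morphism of formal smooth sets
$$\eta_X \;\colon\; X \to \Im X$$
which takes a plot
$$U \overset{f}{\to} X$$
to the composition
$$\Re(U) \hookrightarrow U \overset{f}{\to} X$$
regarded as a plot of $\Im X$.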
(mapping space out of an infinitesimally thickened Cartesian space)
Let be an infinitesimally thickened Cartesian space (def. 39) and let be a formal smooth set (def. 40). Then the mapping space
of smooth functions from to is the formal smooth set whose -plots are the morphisms of formal smooth sets from the Cartesian product of infinitesimally thickened Cartesian spaces to , hence the -plots of :
Let $X := \mathbb{R}^n$ be a Cartesian space (def. 1) regarded as an infinitesimally thickened Cartesian space (def. 39) and thus regarded as a formal smooth set (def. 40) by example 23. Consider the infinitesimal line
from example 20. Then the mapping space (example 25) is the total space of the tangent bundle (example 5). Moreover, under restriction along the reduction , this is the full tangent bundle projection, in that there is a natural isomorphism of formal smooth sets of the form
In particular this implies immediately that smooth sections (def. 5) of the tangent bundle
are equivalently morphisms of the form
which we had already identified with tangent vector fields (def. 5) in example 22.
This follows by an analogous argument as in example 22, using the Hadamard lemma.
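While each infinitesimally thickened Cartesian space (def. 39) contains infinitesimals only up to some fixed finite order, in formal smooth sets (def. 40) we may find infinitesimals of arbitrarily high finite order:
Let $X$ be a formal smooth set (def. 40) and $Y \hookrightarrow X$ a sub-formal smooth set. Then the infinitesimal neighbourhood to arbitrary infinitesimal order of $Y$ in $X$ is the formal smooth set $N_X Y \hookrightarrow X$ whose plots are those plots of $X$
$$\mathbb{R}^n \times \mathrm{Spec}(A) \overset{f}{\to} X$$
such that their reduction (def. 41)
$$\mathbb{R}^n \hookrightarrow \mathbb{R}^n \times \mathrm{Spec}(A) \overset{f}{\to} X$$
factors through a plot of $Y$.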
This allows us to grasp the restriction of field histories to the infinitesimal neighbourhood of a submanifold of spacetime, which will be crucial for the discussion of phase spaces below.
(field histories on infinitesimal neighbourhood of submanifold of spacetime)
Let $E \overset{fb}{\to} \Sigma$ be a field bundle (def. 34) and let $S \hookrightarrow \Sigma$ be a submanifold of spacetime.
We write $N_\Sigma(S) \hookrightarrow \Sigma$ for its infinitesimal neighbourhood in $\Sigma$ (example 27).
Then the set of field histories restricted to $S$, to be denoted
$$\Gamma_S(E) := \Gamma_{N_\Sigma(S)}\big( E \vert_{N_\Sigma(S)} \big),$$
is the set of sections of $E$ restricted to the infinitesimal neighbourhood $N_\Sigma(S)$ (example 27).
We close the discussion of infinitesimal differential geometry by explaining how we may recover the concept of smooth manifolds inside the more general formal smooth sets (def./prop. 44 below). The key point is that the presence of infinitesimals in the theory allows an intrinsic definition of local diffeomorphisms/formally étale morphism (def. 43 and example 28 below). It is noteworthy that the only role this concept plays in the development of field theory below is that smooth manifolds admit triangulations by smooth singular simplices (def. 14) so that the concept of fiber integration of differential forms is well defined over manifolds.
(local diffeomorphism of formal smooth sets)
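Let $X, Y$ be formal smooth sets (def. 40). Then a morphism $f \colon X \to Y$ between them is called a local diffeomorphism or formally étale morphism, denoted
$$f \;\colon\; X \overset{et}{\to} Y,$$
if for each infinitesimally thickened Cartesian space $\mathbb{R}^n \times \mathbb{D}$ (def. 39) we have a natural identification between the $\mathbb{R}^n \times \mathbb{D}$-plots of $X$ with those $\mathbb{R}^n \times \mathbb{D}$-plots of $Y$ whose reduction (def. 41) comes from an $\mathbb{R}^n$-plot of $X$, hence if we have a natural fiber product of sets of plots
$$X\big( \mathbb{R}^n \times \mathbb{D} \big) \simeq Y\big( \mathbb{R}^n \times \mathbb{D} \big) \underset{Y(\mathbb{R}^n)}{\times} X(\mathbb{R}^n)$$
for all infinitesimally thickened Cartesian spaces $\mathbb{R}^n \times \mathbb{D}$.
Stated more abstractly, this means that the naturality square of the unit of the infinitesimal shape $\Im$ (def. 41) is a pullback square
$$X \simeq Y \underset{\Im Y}{\times} \Im X.$$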
(Khavkine-Schreiber 17, def. 3.1)
(local diffeomorphism between Cartesian spaces from the general definition)
For $X, Y$ two ordinary Cartesian spaces (def. 1), regarded as formal smooth sets by example 23, a morphism $f \colon X \to Y$ between them is a local diffeomorphism in the general sense of def. 43 precisely if it is so in the ordinary sense of def. 2.
(Khavkine-Schreiber 17, prop. 3.2)
A smooth manifold $X$ of dimension $n \in \mathbb{N}$ is
a diffeological space (def. 35)
such that
there exists an indexed set $\{ \mathbb{R}^n \overset{\phi_i}{\to} X \}_{i \in I}$ of morphisms of formal smooth sets (def. 40) from Cartesian spaces $\mathbb{R}^n$ (def. 1) (regarded as formal smooth sets via example 15, example 18 and example 24) into $X$ (regarded as a formal smooth set via example 18 and example 24) such that
every point $x \in X_s$ is in the image of at least one of the $\phi_i$;
every $\phi_i$ is a local diffeomorphism according to def. 43;
the final topology induced by the set of plots of $X$ makes $X_s$ a paracompact Hausdorff space.
(Khavkine-Schreiber 17, example 3.4)
For more on smooth manifolds from the perspective of formal smooth sets see at geometry of physics -- manifolds and orbifolds.
Fermion fields and supergeometry
Field theories of interest crucially involve fermionic fields (def. 50 below), such as the Dirac field (example 35 below), which are subject to the “Pauli exclusion principle”, a key reason for the stability of matter. Mathematically this principle means that these fields have field bundles whose field fiber is not an ordinary manifold, but an odd-graded supermanifold (more on this in remark 17 and remark 27 below).
This “supergeometry” is an immediate generalization of the infinitesimal geometry above, where now the infinitesimal elements in the algebra of functions may be equipped with a grading: one speaks of superalgebra.
The “super”-terminology for something as down-to-earth as the mathematical principle behind the stability of matter may seem unfortunate. For better or worse, this terminology has become standard since the middle of the 20th century. But the concept that today is called supercommutative superalgebra was in fact first considered by Grassmann 1844 who got it right (“Ausdehnungslehre”) but apparently was too far ahead of his time and remained unappreciated.
Beware that considering supergeometry does not necessarily involve considering “supersymmetry”. Supergeometry is the geometry of fermion fields (def. 50 below), and that all matter fields in the observable universe are fermionic has been experimentally established since the Stern-Gerlach experiment in 1922. Supersymmetry, on the other hand, is a hypothetical extension of spacetime-symmetry within the context of supergeometry. Here we do not discuss supersymmetry; for details see instead at geometry of physics -- supersymmetry.
(supercommutative superalgebra)
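A real $\mathbb{Z}/2$-graded algebra or superalgebra is an associative algebra $A$ over the real numbers together with a direct sum decomposition of its underlying real vector space
$$A \simeq_{\mathbb{R}} A_{even} \oplus A_{odd}$$
such that the product in the algebra respects the multiplication in the cyclic group of order 2 $\mathbb{Z}/2 = \{even, odd\}$:
$$A_{even} \cdot A_{even} \subset A_{even}, \quad A_{odd} \cdot A_{odd} \subset A_{even}, \quad A_{even} \cdot A_{odd} \subset A_{odd}, \quad A_{odd} \cdot A_{even} \subset A_{odd}.$$
This is called a supercommutative superalgebra if for all elements $a_1, a_2 \in A$ which are of homogeneous degree $\vert a_i \vert \in \mathbb{Z}/2 = \{even, odd\}$ in that
$$a_i \in A_{\vert a_i \vert} \subset A$$
we have
$$a_1 \cdot a_2 = (-1)^{\vert a_1 \vert \vert a_2 \vert} \, a_2 \cdot a_1,$$
where in the exponent $even$ is read as 0 and $odd$ as 1.
A homomorphism of superalgebras
$$f \;\colon\; A \to A'$$
is a homomorphism of associative algebras over the real numbers such that the $\mathbb{Z}/2$-grading is respected in that
$$f(A_{even}) \subset A'_{even}, \qquad f(A_{odd}) \subset A'_{odd}.$$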
For more details on superalgebra see at geometry of physics -- superalgebra.
(basic examples of supercommutative superalgebras)
Basic examples of supercommutative superalgebras (def. 45) include the following:
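Every commutative algebra $A$ becomes a supercommutative superalgebra by declaring it to be all in even degree: $A = A_{even}$.
For $V$ a finite dimensional real vector space, the Grassmann algebra $A := \wedge^\bullet_{\mathbb{R}} V^\ast$ is a supercommutative superalgebra with $A_{even} := \wedge^{even}_{\mathbb{R}} V^\ast$ and $A_{odd} := \wedge^{odd}_{\mathbb{R}} V^\ast$.
More explicitly, if $V = \mathbb{R}^q$ is a Cartesian space with canonical dual coordinates $(\theta^i)_{i = 1}^q$ then the Grassmann algebra $\wedge^\bullet (\mathbb{R}^q)^\ast$ is the real algebra which is generated from the $\theta^i$ regarded in odd degree and hence subject to the relation
$$\theta^i \cdot \theta^j = - \theta^j \cdot \theta^i.$$
In particular this implies that all the $\theta^i$ are infinitesimal (def. 39):
$$\theta^i \cdot \theta^i = 0.$$
For $A_1$ and $A_2$ two supercommutative superalgebras, there is their tensor product supercommutative superalgebra $A_1 \otimes_{\mathbb{R}} A_2$. For example for $X$ a smooth manifold with ordinary algebra of smooth functions $C^\infty(X)$ regarded as a supercommutative superalgebra by the first example above, the tensor product with a Grassmann algebra (second example above) is the supercommutative superalgebra
$$C^\infty(X) \otimes_{\mathbb{R}} \wedge^\bullet (\mathbb{R}^q)^\ast$$
whose elements may uniquely be expanded in the form
$$f + f_i \theta^i + f_{i j} \theta^i \theta^j + \cdots + f_{i_1 \cdots i_q} \theta^{i_1} \cdots \theta^{i_q}$$
where the $f_{i_1 \cdots i_k} \in C^\infty(X)$ are smooth functions on $X$ which are skew-symmetric in their indices.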
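The sign rule of the Grassmann algebra can be checked mechanically. The following is a minimal sketch under assumptions of our own choosing: an element is stored as a dictionary from strictly increasing tuples of generator indices to real coefficients, with the empty tuple standing for 1, and the function names `mono_mul`, `mul` and `theta` are hypothetical.

```python
def mono_mul(a, b):
    """Multiply two basis monomials given as strictly increasing index tuples.

    Returns (sign, sorted_indices); the sign is 0 if a generator repeats,
    since every odd generator squares to zero.
    """
    if set(a) & set(b):
        return 0, ()
    seq = list(a + b)
    # Each inversion corresponds to swapping two odd generators: one minus sign.
    inversions = sum(
        1
        for i in range(len(seq))
        for j in range(i + 1, len(seq))
        if seq[i] > seq[j]
    )
    return (-1) ** inversions, tuple(sorted(seq))


def mul(x, y):
    """Product of two Grassmann algebra elements stored as {indices: coefficient}."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            sign, m = mono_mul(a, b)
            if sign:
                out[m] = out.get(m, 0) + sign * ca * cb
    # Drop monomials whose coefficients cancelled.
    return {m: c for m, c in out.items() if c != 0}


def theta(i):
    """The i-th odd generator."""
    return {(i,): 1}


t12 = mul(theta(1), theta(2))   # theta^1 theta^2
t21 = mul(theta(2), theta(1))   # theta^2 theta^1 = - theta^1 theta^2
t11 = mul(theta(1), theta(1))   # vanishes

# An even element commutes with an odd one.
even_then_odd = mul(t12, theta(3))
odd_then_even = mul(theta(3), t12)

print(t12, t21, t11, even_then_odd, odd_then_even)
```

Coefficients here are plain numbers; replacing them by functions on a manifold would model the tensor product with the algebra of smooth functions, which this sketch does not attempt.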
The following is the direct super-algebraic analog of the definition of infinitesimally thickened Cartesian spaces (def. 39):
A superpoint is represented by a super-commutative superalgebra (def. 45) which as a -graded vector space (super vector space) is a direct sum
of the 1-dimensional even vector space of multiples of 1, with a finite dimensional super vector space that is a nilpotent ideal in in that for each element there exists a natural number such that
More generally, a super Cartesian space is represented by a super-commutative algebra which is the tensor product of algebras of the algebra of smooth functions on an actual Cartesian space of some dimension , with an algebra of functions of a superpoint (example 29).
Specifically, for , there is the superpoint
whose algebra of functions is by definition the Grassmann algebra on generators in odd degree (example 29).
We write
for the corresponding super Cartesian spaces whose algebra of functions is as in example 29.
We say that a smooth function between two super Cartesian spaces
is by definition dually a homomorphism of supercommutative superalgebras (def. 45) of the form
(superpoint induced by a finite dimensional vector space)
Let be a finite dimensional real vector space. With denoting its dual vector space write for the Grassmann algebra that it generates. This being a supercommutative algebra, it defines a superpoint (def. 46).
We denote this superpoint by
All the differential geometry over Cartesian space that we discussed above generalizes immediately to super Cartesian spaces (def. 46) if we strictly adhere to the super sign rule, which says that whenever two odd-graded elements swap places, a minus sign is picked up. In particular we have the following generalization of the de Rham complex on Cartesian spaces discussed above.
(super differential forms on super Cartesian spaces)
For a super Cartesian space (def. 46), hence the formal dual of the supercommutative superalgebra of the form
with canonical even-graded coordinate functions and odd-graded coordinate functions .
Then the de Rham complex of super differential forms on is, in super-generalization of def. 11, the -graded commutative algebra
which is generated over from new generators
whose differential is defined in degree-0 by
and extended from there as a bigraded derivation of bi-degree , in super-generalization of def. 12.
Accordingly, the operation of contraction with tangent vector fields (def. 13) has bi-degree $(-1,\sigma)$ if the tangent vector has super-degree $\sigma$:

| generator | bi-degree |
|---|---|
| $x^a$ | (0,even) |
| $\theta^\alpha$ | (0,odd) |
| $d x^a$ | (1,even) |
| $d \theta^\alpha$ | (1,odd) |

| derivation | bi-degree |
|---|---|
| $d$ | (1,even) |
| $\iota_{\partial_{x^a}}$ | (-1,even) |
| $\iota_{\partial_{\theta^\alpha}}$ | (-1,odd) |

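This means that if $\alpha$ is in bidegree $(n_\alpha, \sigma_\alpha)$, and $\beta$ is in bidegree $(n_\beta, \sigma_\beta)$, then
$$\alpha \wedge \beta = (-1)^{n_\alpha n_\beta + \sigma_\alpha \sigma_\beta} \, \beta \wedge \alpha.$$
Hence there are two contributions to the sign picked up when exchanging two super-differential forms in the wedge product:
there is a “cohomological sign” which for commuting an $n_1$-form past an $n_2$-form is $(-1)^{n_1 n_2}$;
in addition there is a “super-grading” sign which for commuting a $\sigma_1$-graded coordinate function past a $\sigma_2$-graded coordinate function (possibly under the de Rham differential) is $(-1)^{\sigma_1 \sigma_2}$.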
For example:
(e.g. Castellani-D’Auria-Fré 91 (II.2.106) and (II.2.109), Deligne-Freed 99, section 6)
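The combined sign rule can be checked directly on generators. The following sketch assumes the convention of the references just cited, in which exchanging a form of bidegree $(n_1,\sigma_1)$ with a form of bidegree $(n_2,\sigma_2)$ produces the sign $(-1)^{n_1 n_2 + \sigma_1 \sigma_2}$. The coordinate names $x^a$ (even) and $\theta^\alpha$ (odd) are notation assumed here.

```latex
\[
\begin{aligned}
  x^{a_1}\, (d x^{a_2}) &= +\, (d x^{a_2})\, x^{a_1}
    && (n_1 n_2 = 0,\ \sigma_1 \sigma_2 = 0)
  \\
  \theta^{\alpha}\, (d x^{a}) &= +\, (d x^{a})\, \theta^{\alpha}
    && (n_1 n_2 = 0,\ \sigma_1 \sigma_2 = 0)
  \\
  \theta^{\alpha_1}\, (d \theta^{\alpha_2}) &= -\, (d \theta^{\alpha_2})\, \theta^{\alpha_1}
    && (n_1 n_2 = 0,\ \sigma_1 \sigma_2 = 1)
  \\
  d x^{a_1} \wedge d x^{a_2} &= -\, d x^{a_2} \wedge d x^{a_1}
    && (n_1 n_2 = 1,\ \sigma_1 \sigma_2 = 0)
  \\
  d x^{a} \wedge d \theta^{\alpha} &= -\, d \theta^{\alpha} \wedge d x^{a}
    && (n_1 n_2 = 1,\ \sigma_1 \sigma_2 = 0)
  \\
  d \theta^{\alpha_1} \wedge d \theta^{\alpha_2} &= +\, d \theta^{\alpha_2} \wedge d \theta^{\alpha_1}
    && (n_1 n_2 = 1,\ \sigma_1 \sigma_2 = 1)
\end{aligned}
\]
```

Note the last line: the differentials of odd coordinates commute with each other, since the cohomological sign and the super-grading sign cancel.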
Beware that there is also another sign rule for super differential forms used in the literature. See at signs in supergeometry for further discussion.
It is clear now by direct analogy with the definition of formal smooth sets (def. 40) what the corresponding supergeometric generalization is. For definiteness we spell it out yet once more:
A super smooth set is
for each super Cartesian space (def. 46) a set
to be called the set of smooth functions or plots from to ;
for each smooth function between super Cartesian spaces a choice of function
to be thought of as the precomposition operation
such that
If is the identity function on , then is the identity function on the set of plots .
If are two composable smooth functions between super Cartesian spaces, then pullback of plots along them consecutively equals the pullback along the composition:
i.e.
(gluing)
If is such that
is a differentiably good open cover (def. 3) then the function which restricts -plots of to a set of -plots
is a bijection onto the set of those tuples of plots, which are “matching families” in that they agree on intersections:
i.e.
Finally, given and two super formal smooth sets, then a smooth function between them
is
for each super Cartesian space a function
such that
for each smooth function between super Cartesian spaces we have
i.e.
Basing supergeometry on super formal smooth sets is an instance of the general approach to geometry called functorial geometry or topos theory. For more background on this see at geometry of physics -- supergeometry.
In direct generalization of example 15 we have:
(super Cartesian spaces are super smooth sets)
Let be a super Cartesian space (def. 46). Then it becomes a super smooth set (def. 48) by declaring its plots to be the algebra homomorphisms .
Under this identification, morphisms between super Cartesian spaces are in natural bijection with their morphisms regarded as super smooth sets.
Stated more abstractly, this statement is an example of the Yoneda embedding over a subcanonical site.
Similarly, in direct generalization of prop. 17 we have:
(plots of a super smooth set really are the smooth functions into the super smooth set)
Let be a super smooth set (def. 48). For any super Cartesian space (def. 46) there is a natural function
from the set of homomorphisms of super smooth sets from (regarded as a super smooth set via example 31) to , to the set of plots of over , given by evaluating on the identity plot .
This function is a bijection.
This says that the plots of , which initially bootstrap into being as declaring the would-be smooth functions into , end up being the actual smooth functions into .
This is the statement of the Yoneda lemma over the site of super Cartesian spaces.
We do not need to consider here supermanifolds more general than the super Cartesian spaces (def. 46). But for those readers familiar with the concept we include the following direct analog of the characterization of smooth manifolds according to def./prop. 44:
A supermanifold $X$ of super-dimension $(b,s) \in \mathbb{N} \times \mathbb{N}$ is
a super smooth set (def. 48)
such that
there exists an indexed set $\{ \phi_i \}_{i \in I}$ of morphisms of super smooth sets (def. 48) from super Cartesian spaces $\mathbb{R}^{b|s}$ (def. 46) (regarded as super smooth sets via example 31) into $X$, such that
for every plot of $X$ there is a differentiably good open cover (def. 3) restricted to which the plot factors through the $\phi_i$;
every $\phi_i$ is a local diffeomorphism according to def. 43, now with respect not just to infinitesimally thickened points, but with respect to superpoints;
the bosonic part of $X$ is a smooth manifold according to def./prop. 44.
Finally we have the evident generalization of the smooth moduli space of differential forms from example 19 to supergeometry
(universal smooth moduli spaces of super differential forms)
For write
for the super smooth set (def. 31) whose set of plots on a super Cartesian space (def. 46) is the set of super differential forms (def. 47) of cohomological degree
and whose maps of plots is given by pullback of super differential forms.
The de Rham differential on super differential forms applied plot-wise yields a morphism of super smooth sets
As before in def. 37 we then define for any super smooth set its set of differential -forms to be
and we define the de Rham differential on these to be given by postcomposition with (27).
(bosonic fields and fermionic fields)
For a spacetime, such as Minkowski spacetime (def. 23) if a fiber bundle with total space a super Cartesian space (def. 46) (or more generally a supermanifold, def./prop. 49) is regarded as a super-field bundle (def. 34), then
the even-graded sections are called the bosonic field histories;
the odd-graded sections are called the fermionic field histories.
In components, if is a trivial bundle with fiber a super Cartesian space (def. 46) with even-graded coordinates and odd-graded coordinates , then the are called the bosonic field coordinates, and the are called the fermionic field coordinates.
What is crucial for the discussion of field theory is the following immediate supergeometric analog of the smooth structure on the space of field histories from example 16:
(supergeometric space of field histories)
Let be a super-field bundle (def. 34, def. 50).
Then the space of sections, hence the space of field histories, is the super formal smooth set (def. 48)
whose plots for a given Cartesian space and superpoint (def. 46), with the Cartesian products and regarded as super smooth sets according to example 31, are defined to be the morphisms of super smooth sets (def. 48)
which make the following diagram commute:
Explicitly, if is a Minkowski spacetime (def. 23) and a trivial field bundle with field fiber a super vector space (example 9, example 50) this means dually that a plot of the super smooth set of field histories is a homomorphism of supercommutative superalgebras (def. 45)
which make the following diagram commute:
We will focus on discussing the supergeometric space of field histories (example 33) of the Dirac field (def. 35 below). This we consider below in example 35; but first we discuss now some relevant basics of general supergeometry.
Example 33 is really a special case of a general relative mapping space-construction as in example 25. This immediately generalizes also to the supergeometric context.
(super-mapping space out of a super Cartesian space)
Let be a super Cartesian space (def. 46) and let be a super smooth set (def. 48). Then the mapping space
of super smooth functions from to is the super formal smooth set whose -plots are the morphisms of super smooth sets from the Cartesian product of super Cartesian spaces to , hence the -plots of :
In direct generalization of the synthetic tangent bundle construction (example 26) to supergeometry we have
Let be a super smooth set (def. 48) and the superpoint (26) then the supergeometry-mapping space
is called the odd tangent bundle of .
(mapping space of superpoints)
Let be a finite dimensional real vector space and consider its corresponding superpoint from example 30. Then the mapping space (def. 51) out of the superpoint (def. 46) into is the Cartesian product
By def. 52 this says that is the “odd tangent bundle” of .
Let be any super Cartesian space. Then by definition we have the following sequence of natural bijections of sets of plots
Here in the third line we used that the Grassmann algebra is free on its generators in , meaning that a homomorphism of supercommutative superalgebras out of the Grassmann algebra is uniquely fixed by the underlying degree-preserving linear function on these generators. Since in a Grassmann algebra all the generators are in odd degree, this is equivalently a linear map from to the odd-graded real vector space underlying , which is the direct sum .
Then in the fourth line we used that finite direct sums are Cartesian products, so that linear maps into a direct sum are pairs of linear maps into the direct summands.
That all these bijections are natural means that they are compatible with morphisms and therefore this says that and are the same as seen by super-smooth plots, hence that they are isomorphic as super smooth sets.
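For orientation, the sequence of natural bijections used in this argument can be sketched as follows. This is a reconstruction under assumed notation, not necessarily the exact display of the original: $U$ denotes the probing super Cartesian space, $V_{\mathrm{odd}}$ the superpoint induced by $V$, $V_{\mathrm{even}}$ the space $V$ regarded as an ordinary Cartesian space, $\theta$ the odd coordinate on $\mathbb{R}^{0|1}$, and $\mathrm{Hom}_{\mathrm{sAlg}}$, $\mathrm{Hom}_{\mathrm{Vect}}$ the sets of superalgebra homomorphisms and of linear maps.

```latex
\[
\begin{aligned}
  \mathrm{Hom}\big(U, [\mathbb{R}^{0|1}, V_{\mathrm{odd}}]\big)
  &\simeq
  \mathrm{Hom}\big(U \times \mathbb{R}^{0|1}, V_{\mathrm{odd}}\big)
  \\
  &\simeq
  \mathrm{Hom}_{\mathrm{sAlg}}\big(
    \wedge^\bullet V^\ast ,\;
    C^\infty(U) \otimes (\mathbb{R} \oplus \langle \theta \rangle)
  \big)
  \\
  &\simeq
  \mathrm{Hom}_{\mathrm{Vect}}\big(
    V^\ast ,\;
    C^\infty(U)_{\mathrm{odd}} \oplus C^\infty(U)_{\mathrm{even}}\, \theta
  \big)
  \\
  &\simeq
  \mathrm{Hom}_{\mathrm{Vect}}\big( V^\ast, C^\infty(U)_{\mathrm{odd}} \big)
  \times
  \mathrm{Hom}_{\mathrm{Vect}}\big( V^\ast, C^\infty(U)_{\mathrm{even}} \big)
  \\
  &\simeq
  \mathrm{Hom}(U, V_{\mathrm{odd}}) \times \mathrm{Hom}(U, V_{\mathrm{even}})
  \\
  &\simeq
  \mathrm{Hom}(U, V_{\mathrm{even}} \times V_{\mathrm{odd}})
\end{aligned}
\]
```

The first two lines unwind the definition of the mapping space and pass to dual algebras; the third line is the freeness of the Grassmann algebra; the fourth splits a linear map into a direct sum into its two components.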
With this supergeometry in hand we finally turn to defining the Dirac field species:
(field bundle for Dirac field)
For being Minkowski spacetime (def. 23), of dimension , , or , let be the spin representation from prop. 15, whose underlying real vector space is
With
the corresponding superpoint (example 30), then the field bundle for the Dirac field on is
hence the field fiber is the superpoint . This is the corresponding spinor bundle on Minkowski spacetime, with fiber in odd super-degree.
The traditional two-component spinor basis from remark 7 provides fermionic field coordinates (def. 50) on the field fiber :
Notice that these are -valued odd functions: For instance if then each in turn has two components, a real part and an imaginary part.
A key point with the field bundle of the Dirac field (example 35) is that the field fiber coordinates or are now odd-graded elements in the function algebra on the field fiber, which is the Grassmann algebra . Therefore they anti-commute with each other:

snippet grabbed from (Dermisek 09)
We analyze the special nature of the supergeometric space of field histories of the Dirac field a little (prop. 53 below) and conclude by highlighting the crucial role of supergeometry (remark 10 below).
(space of field histories of the Dirac field)
Let be the super-field bundle (def. 50) for the Dirac field over Minkowski spacetime from example 35.
Then the corresponding supergeometric space of field histories
from example 33 has the following properties:
For an ordinary Cartesian space (with no super-geometric thickening, def. 46) there is only a single -parameterized collection of field histories, hence a single plot
and this corresponds to the zero section, hence to the trivial Dirac field
For a super Cartesian space (def. 46) with a single super-odd dimension, then -parameterized collections of field histories
are in natural bijection with plots of sections of the bosonic-field bundle with field fiber the spin representation regarded as an ordinary vector space:
Moreover, these two kinds of plots determine the fermionic field space completely: It is in fact isomorphic, as a super vector space, to the bosonic field space shifted to odd degree (as in example 30):
In the first case, the plot is a morphism of super Cartesian spaces (def. 46) of the form
By definition this is dually a homomorphism of real supercommutative superalgebras
from the Grassmann algebra on the dual vector space of the spin representation to the ordinary algebra of smooth functions on . But the latter has no elements in odd degree, and hence all the Grassmann generators need to be sent to zero.
For the second case, notice that a morphism of the form
is by def. 52 naturally identified with a morphism of the form
where the identification on the right is from example 34.
By the nature of Cartesian products these morphisms in turn are naturally identified with pairs of morphisms of the form
Now, as in the first point above, here the first component is uniquely fixed to be the zero morphism ; and hence only the second component is free to choose. This is precisely the claim to be shown.
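In components, the two cases of this argument can be sketched as follows. The notation is assumed here for illustration: $\psi_\alpha$ denotes the odd generators of the Grassmann algebra $\wedge^\bullet S^\ast$, $\theta$ the odd coordinate on $\mathbb{R}^{0|1}$, $\Phi^\ast$ the algebra homomorphism dual to a plot, and $\Psi_\alpha$ ordinary smooth functions on spacetime $\Sigma$.

```latex
\[
\begin{aligned}
  \text{plot over } \mathbb{R}^n :
  \qquad
  & \Phi^\ast \colon \psi_\alpha \longmapsto 0
  && \text{since } C^\infty(\mathbb{R}^n \times \Sigma)_{\mathrm{odd}} = 0 ,
  \\
  \text{plot over } \mathbb{R}^{0|1} :
  \qquad
  & \Phi^\ast \colon \psi_\alpha \longmapsto \Psi_\alpha(x)\, \theta
  && \text{since } \big( C^\infty(\Sigma) \otimes (\mathbb{R} \oplus \langle \theta \rangle) \big)_{\mathrm{odd}}
     = C^\infty(\Sigma)\, \theta ,
  \\
  \text{consistency} :
  \qquad
  & (\Psi_\alpha \theta)(\Psi_\beta \theta)
    = \Psi_\alpha \Psi_\beta\, \theta \theta
    = 0
    = - (\Psi_\beta \theta)(\Psi_\alpha \theta) .
\end{aligned}
\]
```

The last line confirms that the assignment respects the anti-commutation of the generators, so it does extend to an algebra homomorphism. The ordinary functions $\Psi_\alpha$ are the components of an ordinary section of the spinor bundle.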
(supergeometric nature of the Dirac field)
Proposition 53 shows how two basic facts about the Dirac field, which may superficially seem to be in tension with each other, are properly unified by supergeometry:
On the one hand a field history of the Dirac field is not an ordinary section of an ordinary vector bundle. In particular its component functions anti-commute with each other, which is not the case for ordinary functions, and this is crucial for the Lagrangian density of the Dirac field to be well defined; we come to this below in example 43.
On the other hand a field history of the Dirac field is supposed to be a spinor, hence a section of a spinor bundle, which is an ordinary vector bundle.
Therefore prop. 53 serves to show how, even though a Dirac field is not defined to be an ordinary section of an ordinary vector bundle, it is nevertheless encoded by such an ordinary section: One says that this ordinary section is a “superfield-component” of the Dirac field, the one linear in a Grassmann variable .
This concludes our discussion of the concept of fields itself. In the following chapter we consider the variational calculus of fields.
Given a field bundle as in def. 34 above, then we know what type of quantities the corresponding field histories assign to a given spacetime point (a given event). Among all consistent such field histories, some are to qualify as those that “may occur in reality” if we think of the field theory as a means to describe parts of the observable universe. Moreover, if the reality to be described does not exhibit “action at a distance” then admissibility of its field histories should be determined over arbitrary small spacetime regions, in fact over the infinitesimal neighbourhood of any spacetime point (remark 11 below). This means equivalently that the realized field histories should be those that satisfy a given differential equation, namely an equation between the partial derivatives of the field history at any spacetime point. This is called the equation of motion of the field theory (def. 61 below).
In order to formalize this, it is useful to first collect all the possible partial derivatives that a field history may have at any given point into one big space of “field derivatives at spacetime points”. This collection is called the jet bundle of the field bundle, given as def. 54 below.
Moving around in this space means to change the possible value of fields and their derivatives, hence to vary the fields. Accordingly variational calculus of fields is just differential calculus on the jet bundle of the field bundle; this we consider in def. 59 below.
(jet bundle of a trivial vector bundle over Minkowski spacetime)
Given a field fiber super vector space with linear basis , then for a natural number, the order- jet bundle
over Minkowski spacetime of the trivial vector bundle
is the super Cartesian space (def. 46) which is spanned by coordinate functions to be denoted as follows:
where the indices range from 0 to , while the index ranges from to for the even field coordinates, and then from to for the odd-graded field coordinates and the lower indices are symmetric:
In terms of these coordinates the bundle projection map is just the one that remembers the spacetime coordinates and forgets the values of the field and its derivatives . Similarly there are intermediate projection maps
given by forgetting coordinates with more indices.
The infinite-order jet bundle
is the projective limit (inverse limit) of super smooth sets (def. 48) over these finite order jet bundles. Explicitly this means that it is the super smooth set which is defined by the fact that a smooth function (a plot, by prop. 18)
from some super Cartesian space is equivalently a system of ordinary smooth functions into all the finite-order jet spaces
such that this system is compatible with the above projection maps, i.e. such that
The coordinate functions on a jet bundle (def. 54) are to be thought of as partial derivatives of components of would-be field histories . The power of the jet bundle is that it allows us to disentangle relations between would-be partial derivatives of field history components in themselves from consideration of actual field histories. In traditional physics texts this is often done implicitly. We may make it fully explicit by the operation of jet prolongation, which reads in a field history and records all its partial derivatives in the form of a section of the jet bundle:
Let be a field bundle (def. 34) which happens to be a trivial vector bundle over Minkowski spacetime as in example 9.
There is a smooth function from the space of sections of , the space of field histories (example 33), to the space of sections of the jet bundle (def. 54) which records the field and all its spacetime derivatives:
This is called the operation of jet prolongation: is the jet prolongation of .
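As an illustration, here is jet prolongation in coordinates, followed by a one-dimensional example. The notation is assumed here: $x^\mu$ are spacetime coordinates, $\phi^a, \phi^a_{,\mu}, \phi^a_{,\mu\nu}, \dots$ the jet coordinates, $\Phi$ a field history and $j^\infty_\Sigma(\Phi)$ its jet prolongation. The example field $\cos(\omega t)$ on a one-dimensional base is hypothetical and chosen only because its derivatives are easy to write down.

```latex
\[
\begin{aligned}
  j^\infty_\Sigma(\Phi)(x)
  &=
  \Big(
    x^\mu ,\;
    \Phi^a(x) ,\;
    \frac{\partial \Phi^a}{\partial x^\mu}(x) ,\;
    \frac{\partial^2 \Phi^a}{\partial x^\mu \partial x^\nu}(x) ,\;
    \dots
  \Big) ,
  \\
  \text{i.e.}\qquad
  \phi^a_{,\mu_1 \cdots \mu_k} \circ j^\infty_\Sigma(\Phi)
  &=
  \frac{\partial^k \Phi^a}{\partial x^{\mu_1} \cdots \partial x^{\mu_k}} ;
  \\[1ex]
  \Phi(t) = \cos(\omega t)
  \quad\Longrightarrow\quad
  j^\infty_\Sigma(\Phi)(t)
  &=
  \big(
    t ,\;
    \cos(\omega t) ,\;
    -\omega \sin(\omega t) ,\;
    -\omega^2 \cos(\omega t) ,\;
    \dots
  \big) .
\end{aligned}
\]
```

A general point of the jet bundle, by contrast, is any tuple of numbers $(t, \phi, \phi_{,t}, \phi_{,tt}, \dots)$, whether or not it arises from a single function in this way.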
(jet bundle in terms of synthetic differential geometry)
In terms of the infinitesimal geometry of formal smooth sets (def. 40) the jet bundle (def. 54) of a field bundle has the following incarnation:
A section of the jet bundle over a point of spacetime (an event), is equivalently a section of the original field bundle over the infinitesimal neighbourhood of that point (example 27):
Moreover, given a field history , hence a section of the field bundle, then its jet prolongation (def. 55) is that section of the jet bundle which under the above identification is simply the restriction of to the infinitesimal neighbourhood of :
This follows with an argument as in example 20.
Hence in synthetic differential geometry we have:
The jet of a section at is simply the restriction of that section to the infinitesimal neighbourhood of .
(Khavkine-Schreiber 17, section 3.3)
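At first order this statement is just the truncated Taylor expansion. In the following sketch the symbol $\epsilon^\mu$ for the coordinates on the first-order infinitesimal neighbourhood of a point $x$ is notation assumed here; the defining property is that products of two such coordinates vanish.

```latex
\[
\begin{aligned}
  \epsilon^\mu \epsilon^\nu &= 0
  \\
  \Longrightarrow\quad
  \Phi^a(x + \epsilon)
  &=
  \Phi^a(x) + \epsilon^\mu \, \frac{\partial \Phi^a}{\partial x^\mu}(x)
  \qquad \text{(exactly, with no remainder term)} ,
  \\
  \text{so that}\quad
  \Phi|_{\text{1st order nbhd of } x}
  \;&\longleftrightarrow\;
  \Big( \Phi^a(x) ,\; \frac{\partial \Phi^a}{\partial x^\mu}(x) \Big)
  = \text{first-order jet of } \Phi \text{ at } x .
\end{aligned}
\]
```

For the order-$k$ neighbourhood, where only products of $k+1$ of the $\epsilon^\mu$ vanish, the same reasoning yields the Taylor polynomial of degree $k$, hence the order-$k$ jet.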
So the canonical coordinates on the jet bundle are the spacetime-point-wise possible values of fields and field derivatives, while the jet prolongation picks the actual collections of field derivatives that may occur for an actual field history.
(universal Faraday tensor/field strength on jet bundle)
Consider the field bundle (def. 34) of the electromagnetic field (example 11) over Minkowski spacetime (def. 23), i.e. the cotangent bundle (def. 9) with jet coordinates (def. 54). Consider the functions on the jet bundle given by the linear combinations
of the first order jets.
Then for an electromagnetic field history (“vector potential”), hence a section
with components , its jet prolongation (def. 55)
has components
The pullback of the functions (30) along this jet prolongation are the components of the Faraday tensor of the field (20):
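In coordinates this pullback can be sketched as follows. The index convention is an assumption made here: $a_{\mu,\nu}$ denotes the first-order jet coordinate corresponding to the derivative of the $\mu$-th component along $x^\nu$, and the overall sign of $f_{\mu\nu}$ is fixed so as to agree with $F = d A$. The sample vector potential in the last line is hypothetical, with $B$ a constant.

```latex
\[
\begin{aligned}
  f_{\mu\nu} &:= a_{\nu,\mu} - a_{\mu,\nu} ,
  \\
  j^\infty_\Sigma(A)^\ast (a_{\mu}) &= A_\mu ,
  \qquad
  j^\infty_\Sigma(A)^\ast (a_{\mu,\nu}) = \frac{\partial A_\mu}{\partial x^\nu} ,
  \\
  \Longrightarrow\quad
  j^\infty_\Sigma(A)^\ast (f_{\mu\nu})
  &=
  \frac{\partial A_\nu}{\partial x^\mu} - \frac{\partial A_\mu}{\partial x^\nu}
  = F_{\mu\nu} ;
  \\[1ex]
  A_2 = B\, x^1 ,\ \text{all other } A_\mu = 0
  \quad&\Longrightarrow\quad
  F_{12} = -F_{21} = B ,\ \text{all other } F_{\mu\nu} = 0 .
\end{aligned}
\]
```

The point of the construction is that $f_{\mu\nu}$ is a single function on the jet bundle, defined before any field history is chosen; the Faraday tensor of a particular field is its pullback.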
More generally, for a Lie algebra and
the field bundle for Yang-Mills theory from example 12, consider the functions
on the jet bundle given by
where are the structure constants of the Lie algebra as in (21), and where the square brackets around the indices denote anti-symmetrization.
We may call this the universal Yang-Mills field strength, being the covariant exterior derivative of the universal Yang-Mills field history.
For the line Lie algebra and the canonical inner product on the expression (31) reduces to the universal Faraday tensor (30) for the electromagnetic field (example 36).
For a field history of Yang-Mills theory, hence a Lie algebra-valued differential 1-form, the values of these functions on that field are called the components of the covariant exterior derivative or field strength
(universal B-field strength on jet bundle)
Consider the field bundle (def. 34) of the B-field (example 14) over Minkowski spacetime (def. 23) with jet coordinates (def. 54). Consider the functions on the jet bundle given by the linear combinations
where in the last step we used that .
While the jet bundle (def. 54) is not finite dimensional, reflecting the fact that there are arbitrarily high orders of spacetime derivatives of a field history, it turns out that it is only very “mildly infinite dimensional” in that smooth functions on jet bundles turn out to locally depend on only finitely many of the jet coordinates (i.e. only on a finite order of spacetime derivatives). This is the content of the following prop. 19.
This reflects the locality of Lagrangian field theory defined over jet bundles: If functions on the jet bundle could depend on infinitely many jet coordinates, then by Taylor series expansion of fields the function at one point over spacetime could in fact depend on field history values at a different point of spacetime. Such non-local dependence is ruled out by prop. 19 below.
In practice this means that the situation is very convenient:
Any given local Lagrangian density (which will define a field theory; we come to this in def. 60 below) will locally depend on some finite number of derivatives and may hence locally be treated as living on an ordinary finite-dimensional manifold, namely the jet bundle of some finite order;
while at the same time all formulas (such as for the Euler-Lagrange equations, def. 61) work uniformly without worries about fixing a maximal order of derivatives.
(jet bundle is a locally pro-manifold)
Given a jet bundle as in def. 54, then a smooth function out of it
is such that around each point of the infinite-order jet bundle there is a neighbourhood on which it is given by a smooth function on the jet bundle of some finite order $k$.
(see Khavkine-Schreiber 17, section 2.2 and 3.3)
Example 36 shows that the de Rham differential (def. 12) may be encoded in terms of composing jet prolongation with a suitable function on the jet bundle. More generally, jet prolongation neatly encodes (possibly non-linear) differential operators:
Let and be two smooth fiber bundles over a common base space . Then a (possibly non-linear) differential operator from sections of to sections of is a bundle morphism from the jet bundle of (def. 54) to :
or rather the function between the spaces of sections of these bundles which this induces after composition with jet prolongation (def. 55):
If both and are vector bundles (def. 7) so that their spaces of sections canonically are vector spaces, then is called a linear differential operator if it is a linear function between these vector spaces. This means equivalently that is a linear function in jet coordinates.
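To make the definition concrete, here is a sketch of two differential operators on a single real scalar field over a two-dimensional base, each presented by its function on the jet bundle. All names are assumptions for illustration: $\tilde D$ for the bundle morphism out of the jet bundle, $D = \tilde D \circ j^\infty_\Sigma$ for the induced operator on sections, $\phi, \phi_{,0}, \phi_{,1}$ for jet coordinates, and $c$ for a real constant.

```latex
\[
\begin{aligned}
  \text{linear:}\qquad
  \tilde D_{\mathrm{lin}} &= \phi_{,0} + c\, \phi_{,1}
  &&\Longrightarrow\quad
  D_{\mathrm{lin}}(\Phi)
  = \frac{\partial \Phi}{\partial x^0} + c\, \frac{\partial \Phi}{\partial x^1} ,
  \\
  \text{non-linear:}\qquad
  \tilde D_{\mathrm{nl}} &= \phi_{,0} + \phi\, \phi_{,1}
  &&\Longrightarrow\quad
  D_{\mathrm{nl}}(\Phi)
  = \frac{\partial \Phi}{\partial x^0} + \Phi\, \frac{\partial \Phi}{\partial x^1} .
\end{aligned}
\]
```

The first operator is linear because $\tilde D_{\mathrm{lin}}$ is a linear function of the jet coordinates; the second is not, because of the product $\phi\, \phi_{,1}$.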
(normally hyperbolic differential operator on Minkowski spacetime)
Let be a field bundle (def. 34) which is a vector bundle (def. 7) over Minkowski spacetime (def. 23). Write for its dual vector bundle (def. 8)
A linear differential operator (def. 56)
is of second order if it has a coordinate expansion of the form
for smooth functions on .
This is called a normally hyperbolic differential operator if its principal symbol is proportional to the inverse Minkowski metric (prop./def. 10) , i.e.
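A standard example and a standard non-example may help here. The sketch below is for a single real scalar field; the symbols $m$ (a real constant), $P_{\mathrm{KG}}$ and $P_{\mathrm{heat}}$ are names chosen here, and $\eta^{\mu\nu}$ denotes the inverse Minkowski metric in whichever signature convention is in use.

```latex
\[
\begin{aligned}
  P_{\mathrm{KG}}
  &= \eta^{\mu\nu} \frac{\partial}{\partial x^\mu} \frac{\partial}{\partial x^\nu} - m^2
  &&\text{principal symbol } \eta^{\mu\nu}
  &&\Longrightarrow\ \text{normally hyperbolic} ,
  \\
  P_{\mathrm{heat}}
  &= \frac{\partial}{\partial x^0}
     - \sum_{i \geq 1} \frac{\partial}{\partial x^i} \frac{\partial}{\partial x^i}
  &&\text{principal symbol } -\delta^{i j} \text{ (spatial part only)}
  &&\Longrightarrow\ \text{not normally hyperbolic} .
\end{aligned}
\]
```

Both operators are linear and of second order; they differ only in whether the coefficient of the second-order terms is proportional to the full inverse Minkowski metric.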
(formally adjoint differential operators)
Let be a smooth vector bundle (def. 7) over Minkowski spacetime (def. 23) and write for the dual vector bundle (def. 8).
Then a pair of linear differential operators (def. 56) of the form
are called formally adjoint differential operators via a bilinear differential operator
with values in differential p-forms (def. 11) such that for all sections we have
where is the volume form on Minkowski spacetime (10) and where denotes the de Rham differential (def. 12).
This implies by Stokes' theorem (prop. 4) in the case of compact support that under an integral and are related via integration by parts.
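As a worked check, the wave operator on a real scalar field is formally adjoint to itself. The sketch assumes that the defining relation has the form $\big( P(\Phi_1)\, \Phi_2 - \Phi_1\, P^\ast(\Phi_2) \big)\, dvol_\Sigma = d K(\Phi_1, \Phi_2)$; the symbols $\Box$, $K$ and the raised index $\partial^\mu = \eta^{\mu\nu} \partial_\nu$ are notation assumed here.

```latex
\[
\begin{aligned}
  P = P^\ast = \Box &:= \eta^{\mu\nu} \partial_\mu \partial_\nu ,
  \\
  (\Box \Phi_1)\, \Phi_2 - \Phi_1\, (\Box \Phi_2)
  &=
  \partial_\mu \big(
    (\partial^\mu \Phi_1)\, \Phi_2 - \Phi_1\, (\partial^\mu \Phi_2)
  \big) ,
  \\
  K(\Phi_1, \Phi_2)
  &:=
  \big(
    (\partial^\mu \Phi_1)\, \Phi_2 - \Phi_1\, (\partial^\mu \Phi_2)
  \big)\,
  \iota_{\partial_\mu} dvol_\Sigma ,
  \\
  \Longrightarrow\quad
  \big( (\Box \Phi_1)\, \Phi_2 - \Phi_1\, (\Box \Phi_2) \big)\, dvol_\Sigma
  &= d K(\Phi_1, \Phi_2) .
\end{aligned}
\]
```

The second line follows from the product rule, since the mixed terms $(\partial^\mu \Phi_1)(\partial_\mu \Phi_2)$ cancel. The last line uses that $d \big( K^\mu\, \iota_{\partial_\mu} dvol_\Sigma \big) = (\partial_\mu K^\mu)\, dvol_\Sigma$ on Minkowski spacetime.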
(variational calculus – replacing plain bundle morphisms by differential operators)
Various concepts in variational calculus, especially the concept of evolutionary vector fields (def. 64 below) and gauge parameterized implicit infinitesimal gauge symmetries (def. 23 below) follow from concepts in plain differential geometry by systematically replacing plain bundle morphisms by bundle morphisms out of the jet bundle, hence by differential operators as in def. 56.
(variational derivative and total spacetime derivative – the variational bicomplex)
On the jet bundle of a trivial super vector space-vector bundle over Minkowski spacetime as in def. 54 we may consider its de Rham complex of super differential forms (def. 47); we write its de Rham differential (def. 12) in boldface:
Since the jet bundle unifies spacetime with field values, we want to decompose this differential into a contribution coming from forming the total derivatives of fields along spacetime (“horizontal derivatives”), and actual variation of fields at a fixed spacetime point (“vertical derivatives”):
The total spacetime derivative or horizontal derivative on is the map on differential forms on the jet bundle of the form
which on functions (i.e. on 0-forms) is defined by
and extended to all forms by the graded Leibniz rule, hence as a nilpotent derivation of degree +1.
The variational derivative or vertical derivative
is what remains of the full de Rham differential when the total spacetime derivative (horizontal derivative) is subtracted:
We may then extend the horizontal derivative from functions on the jet bundle to all differential forms on the jet bundle by declaring that
which by (36) is equivalent to
For example
This defines a bigrading on the de Rham complex of , into horizontal degree and vertical degree
such that the horizontal and vertical derivative increase horizontal or vertical degree, respectively:
This is called the variational bicomplex.
Accordingly we will refer to the differential forms on the jet bundle often as variational differential forms.
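In coordinates the two derivatives can be sketched as follows, restricted for simplicity to even (bosonic) field coordinates, where no ordering signs arise. The notation is assumed here: $\mathbf{d}$ is the full de Rham differential on the jet bundle, $d$ the horizontal and $\delta$ the vertical derivative, and $\frac{d f}{d x^\mu}$ denotes the total derivative of a function $f$ on the jet bundle.

```latex
\[
\begin{aligned}
  d f &= \frac{d f}{d x^\mu}\, d x^\mu ,
  \qquad
  \frac{d f}{d x^\mu}
  :=
  \frac{\partial f}{\partial x^\mu}
  + \frac{\partial f}{\partial \phi^a}\, \phi^a_{,\mu}
  + \frac{\partial f}{\partial \phi^a_{,\nu}}\, \phi^a_{,\nu\mu}
  + \cdots ,
  \\
  \delta f &= \mathbf{d} f - d f
  =
  \frac{\partial f}{\partial \phi^a}\, \delta \phi^a
  + \frac{\partial f}{\partial \phi^a_{,\nu}}\, \delta \phi^a_{,\nu}
  + \cdots ,
  \qquad
  \delta \phi^a = \mathbf{d} \phi^a - \phi^a_{,\mu}\, d x^\mu ;
  \\[1ex]
  f = \tfrac{1}{2} \phi^2
  \quad&\Longrightarrow\quad
  d f = \phi\, \phi_{,\mu}\, d x^\mu ,
  \qquad
  \delta f = \phi\, \delta \phi .
\end{aligned}
\]
```

In the example, $d f$ is the form whose pullback along a jet prolongation is the ordinary differential of $\tfrac{1}{2} \Phi^2$, while $\delta f$ records how $f$ changes when the field value is varied at a fixed spacetime point.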
(basic facts about variational calculus)
Given the jet bundle of a field bundle as in def. 54, then in its variational bicomplex (def. 59) we have the following:
The spacetime total derivative (horizontal derivative) of a spacetime coordinate function coincides with its ordinary de Rham differential
which hence is a horizontal 1-form
Therefore the variational derivative (vertical derivative) of a spacetime coordinate function vanishes:
reflecting the fact that is not a field coordinate that could be varied.
In particular the given volume form on gives a horizontal -form on the jet bundle, which has the same coordinate expression (and which we denote by the same symbol)
Generally any horizontal -form is of the form
for
any smooth function of the spacetime coordinates and the field coordinates (locally depending only on a finite order of these, by prop. 19).
In particular every horizontal -form is proportional to the above volume form
for some smooth function that may depend on all the spacetime and field coordinates.
The spacetime total derivative (horizontal derivative) of the variational derivative (vertical derivative) of a field variable is the differential 2-form of horizontal degree 1 and vertical degree 1 given by
In words this says that “the spacetime derivative of the variation of the field is the variation of its spacetime derivative”.
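The short derivation behind this statement can be sketched as follows. It uses only the relation $d \circ \delta = - \delta \circ d$ declared in the definition of the variational bicomplex, together with $d \phi^a = \phi^a_{,\mu}\, d x^\mu$ and $\delta (d x^\mu) = 0$; the symbols are the assumed coordinate notation for jet coordinates and for the horizontal and vertical derivative.

```latex
\[
\begin{aligned}
  d (\delta \phi^a)
  &= - \delta ( d \phi^a )
  \\
  &= - \delta \big( \phi^a_{,\mu}\, d x^\mu \big)
  \\
  &= - (\delta \phi^a_{,\mu}) \wedge d x^\mu
     \;-\; \phi^a_{,\mu}\, \delta (d x^\mu)
  \\
  &= - (\delta \phi^a_{,\mu}) \wedge d x^\mu .
\end{aligned}
\]
```

The result has one factor $\delta \phi^a_{,\mu}$ of vertical degree 1 and one factor $d x^\mu$ of horizontal degree 1, as claimed.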
The following are less trivial properties of variational differential forms:
(pullback along jet prolongation compatible with total spacetime derivatives)
Let be a field bundle over a spacetime (def. 34), with induced jet bundle (def. 54).
Then for any field history, the pullback of differential forms (def. 2)
along the jet prolongation of (def. 55)
intertwines the de Rham differential on spacetime (def. 9) with the total spacetime derivative (horizontal derivative) on the jet bundle (def. 59):
annihilates all vertical differential forms (def. 59):
The operation of pullback of differential forms along any smooth function intertwines the full de Rham differentials (prop. 2). In particular we have that
This means that the second statement immediately follows from the first, by definition of the variational (vertical) derivative as the difference between the full de Rham differential and the horizontal one:
It remains to see the first statement:
Since the jet prolongation preserves the spacetime coordinates (being a section of the jet bundle) it is immediate that the claimed relation is satisfied on the horizontal basis 1-forms (example 38):
Therefore it finally remains only to check the first statement on smooth functions (0-forms). So let
be a smooth function on the jet bundle. Then by the chain rule
That this is equal to follows by the very definition of the total spacetime derivative of (34).
(horizontal variational complex of trivial field bundle is exact)
Let be a field bundle which is a trivial vector bundle over Minkowski spacetime (example 9). Then the chain complex of horizontal differential forms with the total spacetime derivative (horizontal derivative) (def. 59)
is exact: for all the kernel of coincides with the image of in .
More explicitly, this means that not only is every horizontally exact differential form horizontally closed (which follows immediately from the fact that we have a cochain complex in the first place, hence that ), but, conversely, if satisfies , then there exists with .
(e.g. Anderson 89, prop. 4.3)
We will encounter the extension of the exact sequence (40) by further steps to the right below in example 50.
This concludes our discussion of variational calculus on the jet bundle of the field bundle. In the next chapter we apply this to Lagrangian densities on the jet bundle, defining Lagrangian field theories.
Given any type of fields (def. 34), those field histories that are to be regarded as “physically realizable” (if we think of the field theory as a description of the observable universe) should satisfy some differential equation – the equation of motion – meaning that realizability of any field histories may be checked upon restricting the configuration to the infinitesimal neighbourhoods (example 27) of each spacetime point. This expresses the physical absence of “action at a distance” and is one aspect of what it means to have a local field theory. By remark 11 this means that equations of motion of a field theory are equations among the coordinates of the jet bundle of the field bundle.
For many field theories of interest, the differential equation of motion is not an arbitrary partial differential equation, but is of the special kind that exhibits the “principle of extremal action” (prop. 45 below) determined by a local Lagrangian density (def. 60 below). These are called Lagrangian field theories, and this is what we consider here.
Namely, among all the variational differential forms (def. 59) two kinds stand out, namely the 0-forms in – the smooth functions – and the horizontal -forms – to be called the Lagrangian densities (def. 60 below) – since these occupy the two “corners” of the variational bicomplex (38). There is not much to say about the 0-forms, but the Lagrangian densities do inherit special structure from their special position in the variational bicomplex:
Their variational derivative uniquely decomposes as
the Euler-Lagrange derivative which is proportional to the variation of the fields (instead of their derivatives)
the total spacetime derivative of a potential for a presymplectic current .
This is prop. 22 below:
These two terms play a pivotal role in the theory: The condition that the first term vanishes on field histories is a differential equation on field histories, called the Euler-Lagrange equation of motion (def. 61 below). The space of solutions to this differential equation, called the on-shell space of field histories
has the interpretation of the space of “physically realizable field histories”. This is the key object of study in the following chapters. Often this is referred to as the space of classical field histories, indicating that this does not yet reflect the full quantum field theory.
Indeed, there is also the second term in the variational derivative of the Lagrangian density, the presymplectic current , and this implies a presymplectic structure on the on-shell space of field histories (def. 88 below) which encodes deformations of the algebra of smooth functions on . This deformation is the quantization of the field theory to an actual quantum field theory, which we discuss below.
Given a field bundle over a -dimensional Minkowski spacetime as in example 9, then a local Lagrangian density (for the type of field thus defined) is a horizontal differential form of degree (def. 59) on the corresponding jet bundle (def. 54):
By example 38, in terms of the given volume form on spacetime, any such Lagrangian density may uniquely be written as
where the coefficient function (the Lagrangian function) is a smooth function on the spacetime and field coordinates:
where by prop. 19 depends locally on an arbitrary but finite order of derivatives .
We say that a field bundle (def. 34) equipped with a local Lagrangian density is (or defines) a prequantum Lagrangian field theory on the spacetime .
(parameterized and physical unit-less Lagrangian densities)
More generally we may consider parameterized collections of Lagrangian densities, i.e. functions
for some Cartesian space or generally some super Cartesian space.
For example all Lagrangian densities considered in relativistic field theory are naturally smooth functions of the scale of the metric (def. 10)
But by the discussion in remark 3, in physics a rescaling of the metric is interpreted as reflecting but a change of physical units of length/distance. Hence if a Lagrangian density is supposed to express intrinsic content of a physical theory, it should remain unchanged under such a change of physical units.
This is achieved by having the Lagrangian be parameterized by further parameters, whose corresponding physical units compensate that of the metric such as to make the Lagrangian density “physical unit-less”.
This means to consider parameter spaces equipped with an action of the multiplicative group of positive real numbers, and parameterized Lagrangians
(locally variational field theory and Lagrangian p-gerbe connection)
If the field bundle (def. 34) is not just a trivial vector bundle over Minkowski spacetime (example 9) then a Lagrangian density for a given equation of motion may not exist as a globally defined differential -form, but only as a p-gerbe connection. This is the case for locally variational field theories such as the charged particle, the WZW model and generally theories involving higher WZW terms. For more on this see the exposition at Higher Structures in Physics.
(local Lagrangian density for free real scalar field on Minkowski spacetime)
Consider the field bundle for the real scalar field from example 10, i.e. the trivial line bundle over Minkowski spacetime.
According to def. 54 its jet bundle has canonical coordinates
In these coordinates, the local Lagrangian density (def. 60) defining the free real scalar field of mass on is
This is naturally thought of as a collection of Lagrangians smoothly parameterized by the metric and the mass . For this to be physical unit-free in the sense of remark 13 the physical unit of the parameter must be that of the inverse metric, hence must be an inverse length according to remark 3. This is the inverse Compton wavelength (9) and hence the physical unit-free version of the Lagrangian density for the free scalar particle is
(local Lagrangian density for free electromagnetism)
Consider the field bundle for the electromagnetic field on Minkowski spacetime from example 11, i.e. the cotangent bundle, which over Minkowski spacetime happens to be a trivial vector bundle of rank . With fiber coordinates taken to be , the induced fiber coordinates on the corresponding jet bundle (def. 54) are .
Consider then the local Lagrangian density (def. 60) given by
where are the components of the universal Faraday tensor on the jet bundle from example 36.
This is the Lagrangian density that defines the Lagrangian field theory of free electromagnetism.
Here for an electromagnetic field history (vector potential), then the pullback of along its jet prolongation (def. 55) is the corresponding component of the Faraday tensor (20):
It follows that the pullback of the Lagrangian (42) along the jet prolongation of the electromagnetic field is
Here denotes the Hodge star operator of Minkowski spacetime.
More generally:
(Lagrangian density for Yang-Mills theory on Minkowski spacetime)
Let be a finite dimensional Lie algebra which is semisimple. This means that the Killing form invariant polynomial
is a non-degenerate bilinear form. Examples include the special unitary Lie algebras .
Then for the field bundle for Yang-Mills theory as in example 12, the Lagrangian density (def. 60) of -Yang-Mills theory on Minkowski spacetime is
where
is the universal Yang-Mills field strength (31).
(local Lagrangian density for free B-field)
Consider the field bundle for the B-field on Minkowski spacetime from example 14. With fiber coordinates taken to be with
the induced fiber coordinates on the corresponding jet bundle (def. 54) are .
Consider then the local Lagrangian density (def. 60) given by
where are the components of the universal B-field strength on the jet bundle from example 37.
(Lagrangian density for free Dirac field on Minkowski spacetime)
For Minkowski spacetime of dimension (def. 23), consider the field bundle for the Dirac field from example 35. With the two-component spinor field fiber coordinates from remark 7, the jet bundle has induced fiber coordinates as follows:
All of these are odd-graded elements (def. 45) in a Grassmann algebra (example 29), hence anti-commute with each other, in generalization of (28):
The Lagrangian density (def. 60) of the massless free Dirac field on Minkowski spacetime is
given by the bilinear pairing from prop. 16 of the field coordinate with its first spacetime derivative and expressed here in two-component spinor field coordinates as in (15), hence with the Dirac conjugate (14) on the left.
Specifically in spacetime dimension , the Lagrangian function for the massive Dirac field of mass is
This is naturally thought of as a collection of Lagrangians smoothly parameterized by the metric and the mass . For this to be physical unit-free in the sense of remark 13 the physical unit of the parameter must be that of the inverse metric, hence must be an inverse length according to remark 3. This is the inverse Compton wavelength (9) and hence the physical unit-free version of the Lagrangian density for the free Dirac field is
(reality of the Lagrangian density of the Dirac field)
The kinetic term of the Lagrangian density for the Dirac field from def. 43 is a sum of two contributions, one for each chiral spinor component in the full Dirac spinor (remark 7):
Here the computation shown under the brace crucially uses that all these jet coordinates for the Dirac field are anti-commuting, due to their supergeometric nature (44).
Notice that a priori this is a function on the jet bundle with values in . But in fact for it is real up to a total spacetime derivative, because
and similarly for
(e.g. Dermisek I-9)
The beauty of Lagrangian field theory (def. 60) is that a choice of Lagrangian density determines both the equations of motion of the fields as well as a presymplectic structure on the space of solutions to this equation (the “shell”), making it the “covariant phase space” of the theory. All this we discuss below. But in fact all this key structure of the field theory is nothing but the shadow (under “transgression of variational differential forms”, def. 82 below) of the following simple relation in the variational bicomplex:
(Euler-Lagrange form and presymplectic current)
Given a Lagrangian density as in def. 60, then its de Rham differential , which by degree reasons equals , has a unique decomposition as a sum of two terms
such that is proportional to the variational derivative of the fields (but not their derivatives, called a “source form”):
The map
thus defined is called the Euler-Lagrange operator and is explicitly given by the Euler-Lagrange derivative:
The smooth subspace of the jet bundle on which the Euler-Lagrange form vanishes
is called the shell. The smaller subspace on which also all total spacetime derivatives vanish (the “formally integrable prolongation”) is the prolonged shell
Saying something holds “on-shell” is to mean that it holds after restriction to this subspace. For example a variational differential form is said to vanish on shell if .
The remaining term in (46) is unique, while the presymplectic potential
is not unique.
(For a field bundle which is a trivial vector bundle (example 9) over Minkowski spacetime (def. 23), prop. 21 says that is unique up to addition of total spacetime derivatives , for .)
One possible choice for the presymplectic potential is
where
denotes the contraction (def. 13) of the volume form with the vector field .
The vertical derivative of a chosen presymplectic potential is called a pre-symplectic current for :
Given a choice of then the sum
is called the corresponding Lepage form. Its de Rham derivative is the sum of the Euler-Lagrange variation and the presymplectic current:
(Its conceptual nature will be elucidated after the introduction of the local BV-complex in example 75 below.)
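For orientation, the statement can be sketched in local coordinates as follows. The symbols used here ($\mathbf{L} = L \, dvol_\Sigma$ for the Lagrangian density, $\delta$ for the variational derivative, $d$ for the total spacetime derivative, $\mathbf{d} = \delta + d$ for the de Rham differential, $\Theta_{BFV}$ and $\Omega_{BFV}$ for the presymplectic potential and current) are notation assumed for this sketch. The sign in front of $d \Theta_{BFV}$ is chosen so that the Lepage form has the de Rham derivative stated above.

```latex
% Decomposition of the variational derivative (notation assumed, see lead-in)
\[
  \delta \mathbf{L}
  \;=\;
  \delta_{EL}\mathbf{L} \;-\; d\,\Theta_{BFV}
  \,,
  \qquad
  \delta_{EL}\mathbf{L}
  \;=\;
  \frac{\delta_{EL} L}{\delta \phi^a}\, \delta \phi^a \wedge dvol_\Sigma
\]
% Euler-Lagrange derivative: alternating sum over total spacetime derivatives
\[
  \frac{\delta_{EL} L}{\delta \phi^a}
  \;=\;
  \frac{\partial L}{\partial \phi^a}
  - \frac{d}{d x^\mu} \frac{\partial L}{\partial \phi^a_{,\mu}}
  + \frac{d^2}{d x^\mu d x^\nu} \frac{\partial L}{\partial \phi^a_{,\mu\nu}}
  - \cdots
\]
% Lepage form: d L = 0 by degree reasons, hence \mathbf{d}\mathbf{L} = \delta\mathbf{L}
\[
  \mathbf{d}\left( \mathbf{L} + \Theta_{BFV} \right)
  \;=\;
  \underbrace{\delta \mathbf{L} + d\,\Theta_{BFV}}_{=\, \delta_{EL}\mathbf{L}}
  \;+\;
  \underbrace{\delta\, \Theta_{BFV}}_{=\, \Omega_{BFV}}
\]
```

Since the Lagrangian function depends on only finitely many jet coordinates, the alternating sum in the second line terminates after finitely many terms.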
Using and that by degree reasons (example 38), we find
The idea now is to have pick up those terms that would appear as boundary terms under the integral if we were to consider integration by parts to remove spacetime derivatives of .
We compute, using example 38, the total horizontal derivative of from (51) as follows:
where in the last line we used that
Here the two terms proportional to cancel out, and we are left with
Hence shares with the terms that are proportional to for , and so the remaining terms are proportional to , as claimed:
The following fact is immediate from prop. 22, but of central importance; we further amplify this in remark 16 below:
(total spacetime derivative of presymplectic current vanishes on-shell)
Let be a Lagrangian field theory (def. 60). Then the Euler-Lagrange form and the presymplectic current (prop. 22) are related by
In particular this means that restricted to the prolonged shell (49) the total spacetime derivative of the presymplectic current vanishes:
By prop. 22 we have
The claim follows from applying the variational derivative to both sides, using (37): and .
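Spelled out, the argument is a two-line computation. The following sketch assumes the notation $\delta$, $d$, $\Theta_{BFV}$, $\Omega_{BFV} = \delta \Theta_{BFV}$ and the sign convention $\delta \mathbf{L} = \delta_{EL}\mathbf{L} - d \Theta_{BFV}$; with the opposite sign convention for $\Theta_{BFV}$ the final sign flips accordingly.

```latex
% Apply \delta to the decomposition, using \delta \circ \delta = 0 and \delta \circ d = - d \circ \delta
\[
  \begin{aligned}
    0
    \;=\;
    \delta \delta \mathbf{L}
    & =
    \delta\left( \delta_{EL}\mathbf{L} \right) - \delta\, d\, \Theta_{BFV}
    \\
    & =
    \delta\left( \delta_{EL}\mathbf{L} \right) + d\, \delta\, \Theta_{BFV}
    \\
    & =
    \delta\left( \delta_{EL}\mathbf{L} \right) + d\, \Omega_{BFV}
  \end{aligned}
  \qquad
  \Longrightarrow
  \qquad
  d\, \Omega_{BFV} \;=\; - \delta\left( \delta_{EL}\mathbf{L} \right)
\]
```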
Many examples of interest fall into the following two special cases of prop. 22:
(Euler-Lagrange form for spacetime-independent Lagrangian densities)
Let be a Lagrangian field theory (def. 60) whose field bundle is a trivial vector bundle over Minkowski spacetime (example 9).
In general the Lagrangian density is a function of all the spacetime and field coordinates
Consider the special case that is spacetime-independent in that the Lagrangian function is independent of the spacetime coordinate . Then the same evidently holds for the Euler-Lagrange form (prop. 22). Therefore in this case the shell (49) is itself a trivial bundle over spacetime.
In this situation every point in the jet fiber defines a constant section of the shell:
Consider a Lagrangian field theory (def. 60) whose Lagrangian density
does not depend on the spacetime-coordinates (example 24);
depends on spacetime derivatives of field coordinates (hence on jet bundle coordinates) at most to first order.
Hence if the field bundle is a trivial vector bundle over Minkowski spacetime (example 9) this means to consider the case that
Then the presymplectic current (def. 22) is (up to possibly a horizontally exact part) of the form
where
denotes the partial derivative of the Lagrangian function with respect to the spacetime-derivatives of the field coordinates.
Here
is called the canonical momentum corresponding to the “canonical field coordinate” .
In the language of multisymplectic geometry the full expression
is also called the “canonical multi-momentum”, or similar.
We compute:
Hence
(presymplectic current is local version of (pre-)symplectic form of Hamiltonian mechanics)
In the simple but very common situation of example 44 the presymplectic current (def. 22) takes the form (58)
with the field coordinates (“canonical coordinates”) and the “canonical momentum” (58).
Notice that this is of the schematic form “”, which is reminiscent of the wedge product of a symplectic form expressed in Darboux coordinates with a volume form for a -dimensional manifold. Indeed, below in Phase space we discuss that this presymplectic current “transgresses” (def. 82 below) to a presymplectic form of the schematic form “” on the on-shell space of field histories (def. 61) by integrating it over a Cauchy surface of dimension . In good situations this presymplectic form is in fact a symplectic form on the on-shell space of field histories (theorem 2 below).
This shows that the presymplectic current is the local (i.e. jet level) avatar of the symplectic form that governs the formulation of Hamiltonian mechanics in terms of symplectic geometry.
In fact prop. 23 may be read as saying that the presymplectic current is a conserved current (def. 66 below), only that it takes values not in smooth functions of the field coordinates and jets, but in variational 2-forms on fields. There is a conserved charge associated with every conserved current (prop. 48 below) and the conserved charge associated with the presymplectic current is the (pre-)symplectic form on the phase space of the field theory (def. 88 below).
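To see the analogy with Hamiltonian mechanics in the simplest possible case, consider as a sketch a "field theory" over the real line $\Sigma = \mathbb{R}$ (so $p = 0$), with time coordinate $t$, a single field coordinate $q$ and first jet coordinate $\dot q$. The Lagrangian function $L = \tfrac{1}{2} m \dot q^2 - V(q)$, the mass parameter $m$ and the potential $V$ are hypothetical choices made for illustration. The sign convention $\delta \mathbf{L} = \delta_{EL}\mathbf{L} - d \Theta_{BFV}$ is assumed.

```latex
% Mechanics as field theory over the real line: dvol = dt and \iota_{\partial_t} dt = 1
\[
  \mathbf{L} \;=\; \left( \tfrac{1}{2} m \dot q^2 - V(q) \right) dt
  \,,
  \qquad
  p \;:=\; \frac{\partial L}{\partial \dot q} \;=\; m \dot q
\]
\[
  \Theta_{BFV} \;=\; p \, \delta q
  \,,
  \qquad
  \Omega_{BFV} \;=\; \delta \Theta_{BFV} \;=\; \delta p \wedge \delta q
\]
% Check of the decomposition, using d\,\delta q = -\delta\, d q = dt \wedge \delta \dot q
\[
  \begin{aligned}
    d\, \Theta_{BFV}
    & =
    - \left( m \ddot q \, \delta q + m \dot q \, \delta \dot q \right) \wedge dt
    \\
    \delta_{EL}\mathbf{L} - d\, \Theta_{BFV}
    & =
    \left( - V'(q) - m \ddot q \right) \delta q \wedge dt
    +
    \left( m \ddot q \, \delta q + m \dot q \, \delta \dot q \right) \wedge dt
    \\
    & =
    \left( m \dot q \, \delta \dot q - V'(q)\, \delta q \right) \wedge dt
    \;=\;
    \delta \mathbf{L}
  \end{aligned}
\]
```

Here $\Omega_{BFV} = \delta p \wedge \delta q$ is the familiar symplectic form in Darboux coordinates, and the vanishing of $\delta_{EL}\mathbf{L}$ is Newton's equation $m \ddot q = - V'(q)$.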
(Euler-Lagrange form and presymplectic current for free real scalar field)
Consider the Lagrangian field theory of the free real scalar field from example 39.
Then the Euler-Lagrange form and presymplectic current (prop. 22) are
and
respectively.
This is a special case of example 44, but we spell it out in detail again:
We need to show that Euler-Lagrange operator takes the local Lagrangian density for the free scalar field to
First of all, using just that the variational derivative (vertical derivative) is a graded derivation, the result of applying it to the local Lagrangian density is
By definition of the Euler-Lagrange operator, in order to find and , we need to exhibit this as a sum of the form .
The key to find is to realize as a total spacetime derivative (horizontal derivative). Since this is accomplished by
where on the right we have the contraction (def. 13) of the tangent vector field along into the volume form.
Hence we may take the presymplectic potential (50) of the free scalar field to be
because with this we have
In conclusion this yields the decomposition of the vertical differential of the Lagrangian density
which shows that is as claimed, and that is a presymplectic potential current (50). Hence the presymplectic current itself is
(Euler-Lagrange form for free electromagnetic field)
Consider the Lagrangian field theory of free electromagnetism from example 40.
By (47) we have
More generally:
(Euler-Lagrange form for Yang-Mills theory on Minkowski spacetime)
Let be a semisimple Lie algebra and consider the Lagrangian field theory of -Yang-Mills theory from example 41.
Its Euler-Lagrange form (prop. 22) is
where
is the universal Yang-Mills field strength (31).
With the explicit form (47) for the Euler-Lagrange derivative we compute as follows:
In the last step we used that for a semisimple Lie algebra is totally skew-symmetric in its indices (this being the coefficients of the Lie algebra cocycle) which is in transgression with the Killing form invariant polynomial .
(Euler-Lagrange form of free B-field)
Consider the Lagrangian field theory of the free B-field from example 14.
The Euler-Lagrange variational derivative is
where is the universal B-field strength from example 37.
By (47) we have
(Euler-Lagrange form and presymplectic current of Dirac field)
Consider the Lagrangian field theory of the Dirac field on Minkowski spacetime of dimension (example 43).
Then
the Euler-Lagrange variational derivative (def. 22) in the case of vanishing mass is
and in the case that spacetime dimension is and arbitrary mass , it is
its presymplectic current (def. 22) is
In any case the canonical momentum of the Dirac field according to example 44 is
This yields the presymplectic current as claimed, by example 44.
Now regarding the Euler-Lagrange form, first consider the massless case in spacetime dimension , where
Then we compute as follows:
Here the first equation is the general formula (47) for the Euler-Lagrange variation, while the identity under the braces combines two facts (as in remark 17 above):
the symmetry (12) of the spinor bilinear pairing ;
the anti-commutativity (44) of the Dirac field and jet coordinates, due to their supergeometric nature (remark 10).
Finally in the special case of the massive Dirac field in spacetime dimension the Lagrangian function is
where now takes values in the complex numbers (as opposed to in , or ). Therefore we may now form the derivative equivalently by treating and as independent components of the field. This immediately yields the claim.
(trivial Lagrangian densities and the Euler-Lagrange complex)
If a Lagrangian density (def. 39) is in the image of the total spacetime derivative, hence horizontally exact (def. 59)
for any , then both its Euler-Lagrange form as well as its presymplectic current (def. 22) vanish:
This is because with (37) the defining unique decomposition (46) of is given by
which then implies with (52) that
Therefore the Lagrangian densities which are total spacetime derivatives are also called trivial Lagrangian densities.
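As a sketch of why this works, write $\mathbf{L} = d \ell$ and use $\delta \circ d = - d \circ \delta$. The symbol $\ell$ for the horizontal potential and the sign convention $\delta \mathbf{L} = \delta_{EL}\mathbf{L} - d \Theta_{BFV}$ are assumptions of this sketch. The second display is a hypothetical one-dimensional check with coordinates $(t, q, \dot q)$.

```latex
% A horizontally exact Lagrangian density has vanishing Euler-Lagrange form
\[
  \delta \mathbf{L}
  \;=\;
  \delta\, d\, \ell
  \;=\;
  \underbrace{0}_{\delta_{EL}\mathbf{L}}
  \;-\;
  d \underbrace{\left( \delta \ell \right)}_{\Theta_{BFV}}
  \qquad
  \Longrightarrow
  \qquad
  \Omega_{BFV}
  \;=\;
  \delta\, \Theta_{BFV}
  \;=\;
  \delta\, \delta\, \ell
  \;=\;
  0
\]
% One-dimensional check: L = q \dot q is the total derivative of q^2 / 2
\[
  q \dot q \; dt \;=\; d\left( \tfrac{1}{2} q^2 \right)
  \,,
  \qquad
  \frac{\partial (q \dot q)}{\partial q}
  - \frac{d}{d t} \frac{\partial (q \dot q)}{\partial \dot q}
  \;=\;
  \dot q - \dot q
  \;=\;
  0
\]
```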
If the field bundle is a trivial vector bundle (example 9) over Minkowski spacetime (def. 23) then also the converse is true: Every Lagrangian density whose Euler-Lagrange form vanishes is a total spacetime derivative.
Stated more abstractly, this means that the exact sequence of the total spacetime derivative from prop. 21 extends to the right via the Euler-Lagrange variational derivative to an exact sequence of the form
In fact, as shown, this exact sequence keeps going to the right; this is also called the Euler-Lagrange complex.
The next differential after the Euler-Lagrange variational derivative is known as the Helmholtz operator. By definition of exact sequence, the Helmholtz operator detects whether a partial differential equation on field histories, induced by a variational differential form as in (61) comes from varying a Lagrangian density, hence whether it is the equation of motion of a Lagrangian field theory via def. 61.
This way homological algebra is brought to bear on core questions of field theory. For more on this see the exposition at Higher Structures in Physics.
(supergeometric nature of Lagrangian density of the Dirac field)
Observe that the Lagrangian density for the Dirac field (def. 43) makes sense (only) due to the supergeometric nature of the Dirac field (remark 10): If the field jet coordinates were not anti-commuting (44) then the Dirac field’s Lagrangian density (def. 43) would be a total spacetime derivative and hence be trivial according to example 50.
This is because
Here the identification under the brace uses two facts:
the symmetry (12) of the spinor bilinear pairing ;
the anti-commutativity (44) of the Dirac field and jet coordinates, due to their supergeometric nature (remark 10).
The second fact gives the minus sign under the brace, which makes the total expression vanish, if the Dirac field and jet coordinates indeed are anti-commuting (which, incidentally, means that we found an “off-shell conserved current” for the Dirac field, see example 55 below).
If however the Dirac field and jet coordinates did commute with each other, we would instead have a plus sign under the brace, in which case the total horizontal derivative expression above would equal the massless Dirac field Lagrangian (45), thus rendering it trivial in the sense of example 50.
The same supergeometric nature of the Dirac field will be necessary for its intended equation of motion, the Dirac equation (example 52) to derive from a Lagrangian density; see the proof of example 49 below, and see remark 27 below.
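The mechanism can be illustrated by a toy model, given here only as a sketch: a single coordinate $\theta$ over the real line with jet coordinate $\dot\theta$ and Lagrangian function $\theta \dot\theta$. This one-dimensional model and its notation are hypothetical and stand in for the spinor bilinear pairing, which shares the relevant symmetry property.

```latex
% Commuting case: \theta \dot\theta is a total derivative, hence a trivial Lagrangian
\[
  \theta \dot\theta = \dot\theta \theta
  \qquad
  \Longrightarrow
  \qquad
  \frac{d}{d t}\left( \tfrac{1}{2} \theta\, \theta \right)
  \;=\;
  \tfrac{1}{2}\left( \dot\theta \theta + \theta \dot\theta \right)
  \;=\;
  \theta \dot\theta
\]
% Anti-commuting case: the would-be primitive vanishes identically
\[
  \theta \dot\theta = - \dot\theta \theta
  \,,
  \quad
  \theta\, \theta = 0
  \qquad
  \Longrightarrow
  \qquad
  \frac{d}{d t}\left( \tfrac{1}{2} \theta\, \theta \right)
  \;=\;
  \tfrac{1}{2}\left( \dot\theta \theta + \theta \dot\theta \right)
  \;=\;
  0
\]
```

In the anti-commuting case the expression $\tfrac{1}{2}\theta\theta$ no longer serves as a primitive for $\theta \dot\theta$, which is the one-dimensional shadow of the sign under the brace above.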
The key implication of the Euler-Lagrange form on the jet bundle is that it induces the equation of motion on the space of field histories:
(Euler-Lagrange equation of motion)
Given a Lagrangian field theory (def. 60), the corresponding Euler-Lagrange equation of motion is the condition on field histories (def. 33)
to have a jet prolongation (def. 55)
that factors through the shell inclusion (48) defined by vanishing of the Euler-Lagrange form (prop. 22)
(This implies that factors even through the prolonged shell (49).)
In the case that the field bundle is a trivial vector bundle over Minkowski spacetime as in example 9 this is the condition that satisfies the following differential equation (again using prop. 22):
The on-shell space of field histories is the space of solutions to this condition, namely the sub-super smooth set (def. 48) of the full space of field histories (22) (def. 33)
whose plots are those that factor through the shell (61).
More generally for a submanifold of spacetime, we write
for the sub-super smooth set of on-shell field histories restricted to the infinitesimal neighbourhood of in (25).
A Lagrangian field theory (def. 60) with field bundle a vector bundle (e.g. a trivial vector bundle as in example 9) is called a free field theory if its Euler-Lagrange equation of motion (def. 61) is a linear differential equation, in that with
any two on-shell field histories (62) and any two real numbers, also the linear combination
which a priori exists only as an element in the off-shell space of field histories, is again a solution to the equations of motion and hence an element of .
A Lagrangian field theory which is not a free field theory is called an interacting field theory.
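The distinction can be made concrete with a sketch. Let $P$ be a linear differential operator (hypothetical notation, for instance a Klein-Gordon-type operator) and let $g$ be a hypothetical coupling constant multiplying a cubic term in the equation of motion.

```latex
% Free case: the equation P \Phi = 0 is linear, so solutions form a vector space
\[
  P \Phi_1 = 0
  \,,\;
  P \Phi_2 = 0
  \qquad
  \Longrightarrow
  \qquad
  P\left( c_1 \Phi_1 + c_2 \Phi_2 \right)
  \;=\;
  c_1 P \Phi_1 + c_2 P \Phi_2
  \;=\;
  0
\]
% Interacting case: for P \Phi = g \Phi^3 the sum of two solutions fails to be a solution
\[
  P\left( \Phi_1 + \Phi_2 \right) - g \left( \Phi_1 + \Phi_2 \right)^3
  \;=\;
  g\left( \Phi_1^3 + \Phi_2^3 \right) - g\left( \Phi_1 + \Phi_2 \right)^3
  \;=\;
  - 3 g\, \Phi_1 \Phi_2 \left( \Phi_1 + \Phi_2 \right)
\]
```

The right hand side of the second line does not vanish for generic solutions when $g \neq 0$, so an equation of this form defines an interacting field theory in the sense of the definition above.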
(relevance of free field theory)
In perturbative quantum field theory one considers interacting field theories in the infinitesimal neighbourhood (example 27) of free field theories (def. 62) inside some super smooth set of general Lagrangian field theories. While free field theories are typically of limited interest in themselves, this perturbation theory around them exhausts much of what is known about quantum field theory in general, and therefore free field theories are of paramount importance for the general theory.
We discuss the covariant phase space of free field theories below in Propagators and their quantization below in Free quantum fields.
(equation of motion of free real scalar field is Klein-Gordon equation)
Consider the Lagrangian field theory of the free real scalar field from example 39.
By example 45 its Euler-Lagrange form is
Hence for a field history, its Euler-Lagrange equation of motion according to def. 61 is
often abbreviated as
This PDE is called the Klein-Gordon equation on Minkowski spacetime. If the mass vanishes, , then this is the relativistic wave equation.
Hence this is indeed a free field theory according to def. 62.
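As a quick consistency check, one may test the equation on a plane wave. The sketch below assumes metric signature $(-,+,\cdots,+)$, units with $c = \hbar = 1$, time coordinate $t = x^0$, and the Klein-Gordon equation in the form $(\Box - m^2)\Phi = 0$ with $\Box = \eta^{\mu\nu} \partial_\mu \partial_\nu$. With other conventions the signs change accordingly. The wave vector $\vec k$ and frequency $\omega$ are hypothetical parameters.

```latex
% Plane wave ansatz and the resulting dispersion relation
\[
  \Phi(t, \vec x) \;=\; \cos\left( \vec k \cdot \vec x - \omega t \right)
  \,,
  \qquad
  \Box \;=\; - \frac{\partial^2}{\partial t^2} + \nabla^2
\]
\[
  \left( \Box - m^2 \right) \Phi
  \;=\;
  \left( \omega^2 - \vert \vec k \vert^2 - m^2 \right) \Phi
  \;=\;
  0
  \qquad
  \Longleftrightarrow
  \qquad
  \omega^2 \;=\; \vert \vec k \vert^2 + m^2
\]
```

This is the relativistic energy-momentum relation, and for $m = 0$ it reduces to the dispersion relation $\omega = \vert \vec k \vert$ of the wave equation.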
The corresponding linear differential operator (def. 56)
is called the Klein-Gordon operator.
For later use we record the following basic fact about the Klein-Gordon equation:
(Klein-Gordon operator is formally self-adjoint)
The Klein-Gordon operator (65) is its own formal adjoint (def. 58) witnessed by the bilinear differential operator (33) given by
(equations of motion of vacuum electromagnetism are vacuum Maxwell's equations)
Consider the Lagrangian field theory of free electromagnetism on Minkowski spacetime from example 40.
By example 46 its Euler-Lagrange form is
Hence for a field history (“vector potential”), its Euler-Lagrange equation of motion according to def. 61 is
where is the Faraday tensor (20). (In the coordinate-free formulation in the second line “” denotes the Hodge star operator induced by the pseudo-Riemannian metric on Minkowski spacetime.)
These PDEs are called the vacuum Maxwell's equations.
This, too, is a free field theory according to def. 62.
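The step from the Lagrangian density to these equations can be sketched as follows. The normalization used here, namely $f_{\mu\nu} := a_{\nu,\mu} - a_{\mu,\nu}$ on the jet bundle and Lagrangian function $L = c\, f_{\mu\nu} f^{\mu\nu}$ with a non-zero constant $c$, is an assumption of this sketch and may differ by constant factors from the conventions of examples 36 and 40. The resulting equation of motion does not depend on these factors.

```latex
% Euler-Lagrange derivative of L = c f_{\mu\nu} f^{\mu\nu} with respect to a_\nu
\[
  \frac{\partial f_{\alpha\beta}}{\partial a_{\nu,\mu}}
  \;=\;
  \delta^\mu_\alpha \delta^\nu_\beta - \delta^\mu_\beta \delta^\nu_\alpha
  \qquad
  \Longrightarrow
  \qquad
  \frac{\partial L}{\partial a_{\nu,\mu}}
  \;=\;
  2 c \left( f^{\mu\nu} - f^{\nu\mu} \right)
  \;=\;
  4 c\, f^{\mu\nu}
\]
\[
  \frac{\delta_{EL} L}{\delta a_\nu}
  \;=\;
  \underbrace{\frac{\partial L}{\partial a_\nu}}_{=\, 0}
  - \frac{d}{d x^\mu} \frac{\partial L}{\partial a_{\nu,\mu}}
  \;=\;
  - 4 c\, \frac{d}{d x^\mu} f^{\mu\nu}
\]
% On a field history A with F_{\mu\nu} = \partial_\mu A_\nu - \partial_\nu A_\mu
\[
  \frac{\partial}{\partial x^\mu} F^{\mu\nu} \;=\; 0
\]
```

Since $F$ is linear in the vector potential $A$, so is this equation, which is why the theory is free.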
(equation of motion of Dirac field is Dirac equation)
Consider the Lagrangian field theory of the Dirac field on Minkowski spacetime from example 43, with field fiber the spin representation regarded as a superpoint and Lagrangian density given by the spinor bilinear pairing
(in spacetime dimension with unless ).
From example 49 it follows that the corresponding Euler-Lagrange equation of motion (def. 61) is
This is the Dirac equation. In terms of the Feynman slash notation from (16) the corresponding differential operator, the Dirac operator reads
Hence this is a free field theory according to def. 62.
Observe that the “square” of the Dirac operator is the Klein-Gordon operator (64)
This means that a Dirac field which solves the Dirac equation is in particular (on Minkowski spacetime) componentwise a solution to the Klein-Gordon equation.
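The computation behind this observation can be sketched in a convention-independent way. Assume the Clifford relation in the form $\gamma^\mu \gamma^\nu + \gamma^\nu \gamma^\mu = 2 s\, \eta^{\mu\nu}$ with a sign $s \in \{+1, -1\}$ that depends on the chosen convention (a placeholder introduced for this sketch), write $\partial\!\!\!/ = \gamma^\mu \partial_\mu$, and assume the Dirac equation in the form $(- i \partial\!\!\!/ + m)\psi = 0$.

```latex
% Partial derivatives commute, so only the symmetric part of \gamma^\mu \gamma^\nu contributes
\[
  \partial\!\!\!/^{\,2}
  \;=\;
  \gamma^\mu \gamma^\nu \partial_\mu \partial_\nu
  \;=\;
  \tfrac{1}{2}\left( \gamma^\mu \gamma^\nu + \gamma^\nu \gamma^\mu \right) \partial_\mu \partial_\nu
  \;=\;
  s\, \eta^{\mu\nu} \partial_\mu \partial_\nu
  \;=\;
  s\, \Box
\]
% The cross terms cancel because m is a constant
\[
  \left( + i \partial\!\!\!/ + m \right)\left( - i \partial\!\!\!/ + m \right)
  \;=\;
  \partial\!\!\!/^{\,2} + m^2
  \;=\;
  s\, \Box + m^2
\]
```

For $s = -1$ this is $- (\Box - m^2)$, so every solution of the Dirac equation is annihilated by the operator $\Box - m^2$ in each component.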
(supergeometric nature of the Dirac equation as an Euler-Lagrange equation)
While the Dirac equation (67) of example 52 would make sense in itself also if the field coordinates and jet coordinates of the Dirac field were not anti-commuting (44), due to their supergeometric nature (remark 10), it would, by remark 17, then no longer be the Euler-Lagrange equation of a Lagrangian density, hence then Dirac field theory would not be a Lagrangian field theory.
(Dirac operator on Dirac spinors is formally anti-self-adjoint differential operator)
The Dirac operator, hence the differential operator corresponding to the Dirac equation of example 52 via def. 56, is formally anti-self-adjoint (def. 58):
Regard the Dirac operator as taking values in the dual spin bundle by using the Dirac conjugate (14):
Then we need to show that there is such that for all pairs of spinor sections we have
But the spinor-to-vector pairing is symmetric (12), hence this is equivalent to
By the product law of differentiation, this is solved, for all , by
This concludes our discussion of Lagrangian densities and their variational calculus. In the next chapter we consider the infinitesimal symmetries of Lagrangians.
We have introduced the concept of Lagrangian field theories in terms of a field bundle equipped with a Lagrangian density on its jet bundle (def. 60). Generally, given any object equipped with some structure, it is of paramount interest to determine the symmetries, hence the isomorphisms/equivalences of the object that preserve the given structure (this is the “Erlanger program”, Klein 1872).
The infinitesimal symmetries of the Lagrangian density (def. 66 below) send one field history to an infinitesimally nearby one which is “equivalent” for all purposes of field theory. Among these are the infinitesimal gauge symmetries which will be of concern below. A central theorem of variational calculus says that infinitesimal symmetries of the Lagrangian correspond to conserved currents; this is Noether's theorem I, prop. 30 below. These conserved currents constitute an extension of the Lie algebra of symmetries, called the Dickey bracket.
But in (54) we have seen that the Lagrangian density of a Lagrangian field theory is just one component, in codimension 0, of an inhomogeneous “Lepage form” which in codimension 1 is given by the presymplectic potential current (50). (This will be conceptually elucidated, after we have introduced the local BV-complex, in example 75 below.) This means that in codimension 1 we are to consider infinitesimal on-shell symmetries of the Lepage form . These are known as Hamiltonian vector fields (def. 70 below) and the analog of Noether's theorem I now says that these correspond to Hamiltonian differential forms. The Lie algebra of these infinitesimal symmetries is called the local Poisson bracket (prop. 36 below).
Noether theorem and Hamiltonian Noether theorem
| variational form | symmetry | homotopy formula | physical quantity | local symmetry algebra |
|---|---|---|---|---|
| Lagrangian density (def. 60) | | | conserved current (def. 66) | Dickey bracket |
| presymplectic current (prop. 22) | | | Hamiltonian form (def. 70) | local Poisson bracket (prop. 36) |
In Phase space below we transgress this local Poisson bracket of infinitesimal symmetries of the presymplectic potential current to the “global” Poisson bracket on the covariant phase space (def. 90 below). This is the structure which then further below leads over to the quantization (deformation quantization) of the prequantum field theory to a genuine perturbative quantum field theory. However, it will turn out that there may be an obstruction to this construction, namely the existence of special infinitesimal symmetries of the Lagrangian densities, called implicit gauge symmetries (discussed further below).
We now discuss these topics:
infinitesimal symmetries of the Lagrangian density
(variation)
Let be a field bundle (def. 34).
A variation is a vertical vector field on the jet bundle (def. 54), hence a vector field which vanishes when evaluated on the horizontal differential forms.
In the special case that the field bundle is a trivial vector bundle over Minkowski spacetime as in example 9, a variation is of the form
The concept of variation in def. 63 is very general, in that it allows varying the field coordinates independently from the corresponding jets. This generality is necessary for the discussion of symmetries of presymplectic currents in def. 70 below. But for the discussion of symmetries of Lagrangian densities we are interested in explicitly varying just the field coordinates (def. 64 below) and inducing from this the corresponding variations of the field derivatives (prop. 28 below).
In order to motivate the following definition 64 of evolutionary vector fields we follow remark 12 saying that concepts in variational calculus are obtained from their analogous concepts in plain differential calculus by replacing plain bundle morphisms by morphisms out of the jet bundle:
Given a fiber bundle , then a vertical vector field on is a section of its vertical tangent bundle (def. 6), hence is a bundle morphism of this form
The variational version replaces the vector bundle on the left with its jet bundle:
Let be a field bundle (def. 34). Then an evolutionary vector field on is a “variational vertical vector field” on , hence a smooth bundle homomorphism out of the jet bundle (def. 54)
to the vertical tangent bundle (def. 6) of .
In the special case that the field bundle is a trivial vector bundle over Minkowski spacetime as in example 9, this means that an evolutionary vector field is a tangent vector field (example 5) on of the special form
where the coefficients are general smooth functions on the jet bundle (while the components are tangent vectors along the field coordinates , but not along the spacetime coordinates and not along the jet coordinates ).
We write
for the space of evolutionary vector fields, regarded as a module over the -algebra
of smooth functions on the jet bundle.
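A coordinate sketch of an evolutionary vector field, under the assumption that the jet bundle carries coordinates $x^\mu$, $\phi^a$, $\phi^a_{,\mu}$, and so on (these names are illustrative):

```latex
% General shape: only \partial_{\phi^a} directions, but jet-dependent coefficients.
\[
  v \;=\; v^a\big( (x^\mu), (\phi^b), (\phi^b_{,\mu}), \cdots \big)\,\partial_{\phi^a}
\]
% Illustrative example for a single scalar field \phi and a fixed index \nu:
\[
  v_\nu \;=\; \phi_{,\nu}\,\partial_{\phi}
\]
% The coefficient \phi_{,\nu} depends on a first-order jet coordinate,
% yet the vector field has no components along x^\mu or along the jet coordinates.
```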
An evolutionary vector field (def. 64) describes an infinitesimal change of field values depending on, possibly, the point in spacetime and the values of the field and all its derivatives (locally to finite order, by prop. 19).
This induces a corresponding infinitesimal change of the derivatives of the fields, called the prolongation of the evolutionary vector field:
(prolongation of evolutionary vector field)
Let be a fiber bundle.
Given an evolutionary vector field on (def. 64) there is a unique tangent vector field (example 5) on the jet bundle (def. 54) such that
agrees on field coordinates (as opposed to jet coordinates) with :
which means in the special case that is a trivial vector bundle over Minkowski spacetime (example 9) that is of the form
contraction with (def. 13) anti-commutes with the total spacetime derivative (def. 59):
In particular Cartan's homotopy formula (prop. 3) for the Lie derivative holds with respect to the variational derivative :
Explicitly, in the special case that the field bundle is a trivial vector bundle over Minkowski spacetime (example 9) is given by
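A hedged coordinate sketch of the prolongation, assuming that $\frac{d}{dx^\mu}$ denotes the total spacetime derivative acting on functions on the jet bundle and that sums over symmetric multi-indices are normalized so that $\partial_{\phi^a_{,\mu_1\cdots\mu_n}}$ is dual to $\delta\phi^a_{,\mu_1\cdots\mu_n}$:

```latex
% Total spacetime derivative on jet-bundle functions (assumed notation):
\[
  \frac{d}{dx^\mu}
    \;=\; \partial_{\mu}
    + \phi^a_{,\mu}\,\partial_{\phi^a}
    + \phi^a_{,\mu\nu}\,\partial_{\phi^a_{,\nu}}
    + \cdots
\]
% Prolongation of v = v^a \partial_{\phi^a}: differentiate the coefficients totally.
\[
  \hat v \;=\; \sum_{n=0}^{\infty}
    \frac{d^n v^a}{dx^{\mu_1}\cdots dx^{\mu_n}}\;
    \partial_{\phi^a_{,\mu_1\cdots\mu_n}}
  \;=\; v^a\,\partial_{\phi^a}
    + \frac{d v^a}{dx^\mu}\,\partial_{\phi^a_{,\mu}}
    + \cdots
\]
% Example: v_\nu = \phi_{,\nu}\,\partial_\phi for a scalar field gives
\[
  \hat v_\nu \;=\; \phi_{,\nu}\,\partial_\phi
    + \phi_{,\nu\mu}\,\partial_{\phi_{,\mu}}
    + \cdots
\]
```

The first-order term can be checked directly: contracting $\hat v$ into $d\,\delta\phi^a = dx^\mu \wedge \delta\phi^a_{,\mu}$ and into $\delta\phi^a$ and demanding anti-commutation with $d$ forces the coefficient of $\partial_{\phi^a_{,\mu}}$ to be $\frac{d v^a}{dx^\mu}$.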
It is sufficient to prove the coordinate version of the statement. We prove this by induction over the maximal jet order . Notice that the coefficient of in is given by the contraction (def. 13).
Similarly (at “”) the component of is given by . But by the second condition above this vanishes:
Moreover, the coefficient of in is fixed by the first condition above to be
This shows the statement for . Now assume that the statement is true up to some . Observe that the coefficients of all are fixed by the contractions with . For this we find again from the second condition and using as well as the induction assumption that
This shows that satisfying the two conditions given exists uniquely.
Finally formula (70) for the Lie derivative follows from the second of the two conditions with Cartan's homotopy formula (prop. 3) together with (35).
(evolutionary vector fields form a Lie algebra)
Let be a fiber bundle. For any two evolutionary vector fields , on (def. 64) the Lie bracket of tangent vector fields of their prolongations , (def. 28) is itself the prolongation of a unique evolutionary vector field .
This defines the structure of a Lie algebra on evolutionary vector fields.
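In coordinates the bracket can be read off by evaluating on field coordinates. A sketch, assuming $v_i = v_i^a \partial_{\phi^a}$ with prolongations $\hat v_i$:

```latex
\begin{align*}
  [\hat v_1, \hat v_2] &= \widehat{[v_1, v_2]} \,, \\
  [v_1, v_2]^a &= \hat v_1\big(v_2^a\big) - \hat v_2\big(v_1^a\big) \,.
\end{align*}
% Check on a field coordinate, using \hat v_i(\phi^a) = v_i^a:
%   [\hat v_1,\hat v_2](\phi^a)
%     = \hat v_1(\hat v_2(\phi^a)) - \hat v_2(\hat v_1(\phi^a))
%     = \hat v_1(v_2^a) - \hat v_2(v_1^a).
```

Note that the prolongations, not just the evolutionary vector fields themselves, act on the coefficients, since these may depend on jet coordinates.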
It is clear that is still vertical, therefore, by prop. 28, it is sufficient to show that contraction with this vector field (def. 13) anti-commutes with the horizontal derivative , hence that .
Now is an operator that sends vertical 1-forms to horizontal 1-forms and vanishes on horizontal 1-forms. Therefore it is sufficient to see that this operator in fact also vanishes on all vertical 1-forms. But for this it is sufficient that it commutes with the vertical derivative. This we check by Cartan calculus, using and , by assumption:
Now given an evolutionary vector field, we want to consider the flow that it induces on the space of field histories:
(flow of field histories along evolutionary vector field)
Let be a field bundle (def. 34) and let be an evolutionary vector field (def. 64) such that the ordinary flow of its prolongation (prop. 28)
exists on the jet bundle (e.g. if the order of derivatives of field coordinates that it depends on is bounded).
For a collection of field histories (hence a plot of the space of field histories (def. 33) ) the flow of through is the smooth function
whose unique factorization through the space of jets of field histories (i.e. the image of jet prolongation, def. 55)
takes a plot of the real line (regarded as a super smooth set via example 31), to the plot
of the smooth space of sections of the jet bundle.
(That indeed flows jet prolongations again to jet prolongations is due to its defining relation to the evolutionary vector field from prop. 28.)
(infinitesimal symmetries of the Lagrangian and conserved currents)
Let be a Lagrangian field theory (def. 60).
Then
an infinitesimal symmetry of the Lagrangian is an evolutionary vector field (def. 64) such that the Lie derivative of the Lagrangian density along its prolongation (prop. 28) is a total spacetime derivative:
an on-shell conserved current is a horizontal -form (def. 59) whose total spacetime derivative vanishes on the prolonged shell (48)
Let be a Lagrangian field theory (def. 60).
If is an infinitesimal symmetry of the Lagrangian (def. 66) with , then
is an on-shell conserved current (def. 66), for a presymplectic potential (50) from def. 22.
(Noether's theorem II is prop. 78 below.)
By Cartan's homotopy formula for the Lie derivative (prop. 3), the decomposition of the variational derivative (46), and the fact that contraction with the prolongation of an evolutionary vector field vanishes on horizontal differential forms (68) and anti-commutes with the horizontal differential (69), by def. 64, we may re-express the defining equation for the symmetry as follows:
which is equivalent to
Since, by definition of the shell , the differential form on the right vanishes on this yields the claim.
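The steps of this argument can be spelled out as follows. This is a sketch under assumed notation and sign conventions: $\mathbf{L}$ for the Lagrangian density, $\delta\mathbf{L} = \delta_{EL}\mathbf{L} - d\Theta_{BFV}$ for the decomposition of the variational derivative, and $\mathcal{L}_{\hat v}\mathbf{L} = d\tilde J_{v}$ for the symmetry condition.

```latex
% \iota_{\hat v}\mathbf{L} = 0 and d\mathbf{L} = 0 because \mathbf{L} is horizontal of top degree.
\begin{align*}
  d\tilde J_{v}
    \;=\; \mathcal{L}_{\hat v}\mathbf{L}
    &= \iota_{\hat v}\,\delta\mathbf{L} \\
    &= \iota_{\hat v}\,\delta_{EL}\mathbf{L} \;-\; \iota_{\hat v}\, d\,\Theta_{BFV} \\
    &= \iota_{\hat v}\,\delta_{EL}\mathbf{L} \;+\; d\,\iota_{\hat v}\Theta_{BFV}
\end{align*}
% Rearranged:
\[
  d\big(\tilde J_{v} - \iota_{\hat v}\Theta_{BFV}\big)
    \;=\; \iota_{\hat v}\,\delta_{EL}\mathbf{L}
\]
% The right-hand side vanishes on the prolonged shell, so
% J_v := \tilde J_v - \iota_{\hat v}\Theta_{BFV} is conserved on-shell.
```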
(energy-momentum of the scalar field)
Consider the Lagrangian field theory of the free scalar field from def. 39:
For consider the vector field on the jet bundle given by
This describes infinitesimal translations of the fields in the direction of .
And this is an infinitesimal symmetry of the Lagrangian (def. 66), since
With the formula (60) for the presymplectic potential
it hence follows from Noether's theorem (prop. 30) that the corresponding conserved current (def. 66) is
This conserved current is called the energy-momentum tensor.
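As a component-level sanity check, the conservation law can be verified by hand. The following sketch assumes the Lagrangian function $L = \tfrac12(\eta^{\mu\nu}\phi_{,\mu}\phi_{,\nu} - m^2\phi^2)$ as the coefficient of the volume form and the symmetry generator $v_\nu = \phi_{,\nu}\partial_\phi$; overall signs and the normalization of the current depend on conventions that may differ from those of example 39.

```latex
\begin{align*}
  \hat v_\nu(L)
    &= \frac{\partial L}{\partial\phi}\,\phi_{,\nu}
     + \frac{\partial L}{\partial\phi_{,\mu}}\,\phi_{,\nu\mu}
     = \frac{dL}{dx^\nu}
     && \text{(a total derivative, since $L$ has no explicit $x$-dependence)} \\
  T^\mu{}_\nu
    &:= \frac{\partial L}{\partial\phi_{,\mu}}\,\phi_{,\nu} - \delta^\mu_\nu\,L
     = \eta^{\mu\rho}\phi_{,\rho}\phi_{,\nu}
       - \tfrac12\,\delta^\mu_\nu\big(\eta^{\rho\sigma}\phi_{,\rho}\phi_{,\sigma} - m^2\phi^2\big) \\
  \frac{d}{dx^\mu}T^\mu{}_\nu
    &= \big(\eta^{\mu\rho}\phi_{,\mu\rho} + m^2\phi\big)\,\phi_{,\nu}
     = -\,\frac{\delta_{EL}L}{\delta\phi}\,\phi_{,\nu}
\end{align*}
% With these conventions \delta_{EL}L/\delta\phi = -(\eta^{\mu\rho}\phi_{,\mu\rho} + m^2\phi),
% so the divergence of T^\mu{}_\nu vanishes exactly on the shell.
```

In the differential-form language used here, the corresponding horizontal $p$-form would be $T^\mu{}_\nu\,\iota_{\partial_\mu} dvol_\Sigma$ for each fixed $\nu$.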
Consider the Lagrangian field theory of the free Dirac field on Minkowski spacetime in spacetime dimension (example 43)
Then the prolongation (prop. 28) of the evolutionary vector field (def. 64)
is an infinitesimal symmetry of the Lagrangian (def. 66). The conserved current that corresponds to this under Noether's theorem I (prop. 30) is
This is called the Dirac current.
In fact, due to the supergeometric nature of the Dirac field, the Dirac current is conserved even off-shell, as discussed in remark 17.
Since an infinitesimal symmetry of a Lagrangian (def. 66) by definition changes the Lagrangian only up to a total spacetime derivative, and since the Euler-Lagrange equations of motion by construction depend on the Lagrangian density only up to a total spacetime derivative (prop. 22), it is plausible that an infinitesimal symmetry of the Lagrangian preserves the equations of motion (47), hence the shell (49). That this is indeed the case is the statement of prop. 33 below.
To make the proof transparent, we now first introduce the concept of the evolutionary derivative (def. 68) below and then observe that in terms of these the Euler-Lagrange derivative is in fact a derivation (prop. 31).
For
a fiber bundle (def. 6), regarded as a field bundle (def. 34), and for
any other fiber bundle over the same base space (spacetime), we write
for the space of sections of the pullback of bundles of to the jet bundle (def. 54) along .
(Equivalently this is the space of differential operators from sections of to sections of , according to prop. 56.)
In (Olver 93, section 5.1, p. 288) the field dependent sections of def. 67, considered in local coordinates, are referred to as tuples of differential functions.
(source forms and evolutionary vector fields are field-dependent sections)
For a field bundle, write for its vertical tangent bundle (example 6) and for its dual vector bundle (def. 8), the vertical cotangent bundle.
Then the field-dependent sections of these bundles according to def. 67 are identified as follows:
the space contains the space of evolutionary vector fields (def. 64) as those bundle morphisms which respect not just the projection to but also its factorization through :
contains the space of source forms (prop. 22) as those bundle morphisms which respect not just the projection to but also its factorization through :
This makes manifest the duality pairing between source forms and evolutionary vector fields
which in local coordinates is given by
for smooth functions on the jet bundle (as in prop. 19).
(evolutionary derivative of field-dependent section)
Let
be a fiber bundle regarded as a field bundle (def. 34) and let
be a vector bundle (def. 7). Then for
a field-dependent section of according to def. 67, its evolutionary derivative is the morphism
which, under the identification of example 56, sends an evolutionary vector field to the derivative of (example 5) along the prolongation tangent vector field of (prop. 28).
In the case that and are trivial vector bundles over Minkowski spacetime with coordinates and , respectively (example 9), then by (71) this is given by
This makes manifest that may equivalently be regarded as a -dependent differential operator (def. 56) from the vertical tangent bundle (def. 6) to , namely a bundle homomorphism over of the form
in that
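A coordinate sketch of the evolutionary derivative, assuming a field-dependent section $A$ with components $A^i$ and an evolutionary vector field $v = v^a\partial_{\phi^a}$ with prolongation $\hat v$ (the symbol $\mathrm{D}_A$ is illustrative notation):

```latex
% Evolutionary derivative applied to v: differentiate A along the prolongation of v.
\[
  \big(\mathrm{D}_A(v)\big)^i
    \;=\; \hat v\big(A^i\big)
    \;=\; \sum_{n=0}^{\infty}
      \frac{\partial A^i}{\partial\phi^a_{,\mu_1\cdots\mu_n}}\,
      \frac{d^n v^a}{dx^{\mu_1}\cdots dx^{\mu_n}}
\]
% Read as a field-dependent differential operator acting on the coefficients v^a:
\[
  \big(\mathrm{D}_A\big)^i{}_a
    \;=\; \sum_{n=0}^{\infty}
      \frac{\partial A^i}{\partial\phi^a_{,\mu_1\cdots\mu_n}}\,
      \frac{d^n}{dx^{\mu_1}\cdots dx^{\mu_n}}
\]
```

The coefficients of this operator are functions on the jet bundle, which is the sense in which the operator depends on the fields.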
(evolutionary derivative of Lagrangian function)
Over Minkowski spacetime (def. 23), let be a Lagrangian density (def. 60), with coefficient function regarded as a field-dependent section (def. 67) of the trivial real line bundle:
Then the formally adjoint differential operator (def. 58)
of its evolutionary derivative, def. 68, regarded as a -dependent differential operator from to and applied to the constant section
is the Euler-Lagrange derivative (47)
via the identification from example 56.
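In coordinates this statement amounts to the familiar formula for the Euler-Lagrange derivative. A sketch, assuming $L$ denotes the coefficient function of the Lagrangian density and that the formal adjoint of an operator $\mathrm{D} = \sum_n A_a^{\mu_1\cdots\mu_n}\frac{d^n}{dx^{\mu_1}\cdots dx^{\mu_n}}$ is obtained by integrating by parts:

```latex
% Formal adjoint by integration by parts (total derivatives are discarded):
\[
  \big(\mathrm{D}^\ast(\beta)\big)_a
    \;=\; \sum_{n=0}^{\infty} (-1)^n
      \frac{d^n}{dx^{\mu_1}\cdots dx^{\mu_n}}
      \big(A_a^{\mu_1\cdots\mu_n}\,\beta\big)
\]
% Applied to D = D_L, with coefficients \partial L/\partial\phi^a_{,\mu_1...\mu_n},
% and to the constant section \beta = 1:
\[
  \big((\mathrm{D}_L)^\ast(1)\big)_a
    \;=\; \sum_{n=0}^{\infty} (-1)^n
      \frac{d^n}{dx^{\mu_1}\cdots dx^{\mu_n}}
      \frac{\partial L}{\partial\phi^a_{,\mu_1\cdots\mu_n}}
    \;=\; \frac{\delta_{EL} L}{\delta\phi^a}
\]
```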
(Euler-Lagrange derivative is derivation via evolutionary derivatives)
Let be a vector bundle (def. 7) and write for its dual vector bundle (def. 8).
For field-dependent sections (def. 67)
and
we have that the Euler-Lagrange derivative (47) of their canonical pairing to a smooth function on the jet bundle (as in prop. 19) is the sum of the derivative of either one via the formally adjoint differential operator (def. 58) of the evolutionary derivative (def. 68) of the other:
It is sufficient to check this in local coordinates. By the product law for differentiation we have
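The coordinate computation can be sketched as follows, assuming components $\alpha^i$ and $\beta_i$ for the two field-dependent sections, the pairing $\alpha\cdot\beta = \alpha^i\beta_i$, and illustrative notation $\mathrm{D}_\alpha$, $\mathrm{D}_\beta$ for the evolutionary derivatives with formal adjoints obtained by integration by parts:

```latex
\begin{align*}
  \frac{\delta_{EL}(\alpha^i\beta_i)}{\delta\phi^a}
    &= \sum_{n=0}^{\infty} (-1)^n
       \frac{d^n}{dx^{\mu_1}\cdots dx^{\mu_n}}
       \left(
         \frac{\partial\alpha^i}{\partial\phi^a_{,\mu_1\cdots\mu_n}}\,\beta_i
         + \alpha^i\,\frac{\partial\beta_i}{\partial\phi^a_{,\mu_1\cdots\mu_n}}
       \right) \\
    &= \big((\mathrm{D}_\alpha)^\ast(\beta)\big)_a
     + \big((\mathrm{D}_\beta)^\ast(\alpha)\big)_a
\end{align*}
% First line: definition of the Euler-Lagrange derivative plus the product rule.
% Second line: each summand is the formal adjoint of an evolutionary derivative
% applied to the other factor.
```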
(evolutionary derivative of Euler-Lagrange forms is formally self-adjoint)
Let be a Lagrangian field theory (def. 60) over Minkowski spacetime (def. 23) and regard the Euler-Lagrange derivative
(from prop. 22) as a field-dependent section of the vertical cotangent bundle
as in example 56. Then the corresponding evolutionary derivative field-dependent differential operator (def. 68) is formally self-adjoint (def. 58):
(Olver 93, theorem 5.92) The following proof is due to Igor Khavkine.
By definition of the Euler-Lagrange form (def. 22) we have
Applying the variational derivative (def. 59) to both sides of this equation yields
It follows that for any two evolutionary vector fields the contraction (def. 13) of their prolongations and (def. 28) into the differential 2-form on the left is
by inspection of the definition of the evolutionary derivative (def. 68). Moreover, their contraction into the differential form on the right is
by the fact (prop. 28) that contraction with prolongations of evolutionary vector fields anti-commutes with the total spacetime derivative (69).
Hence the last two equations combined give
This is the defining condition for to be a formally self-adjoint differential operator (def. 58).
Now we may finally prove that an infinitesimal symmetry of the Lagrangian is also an infinitesimal symmetry of the Euler-Lagrange equations of motion:
(infinitesimal symmetries of the Lagrangian are also infinitesimal symmetries of the equations of motion)
Let be a Lagrangian field theory. If an evolutionary vector field is an infinitesimal symmetry of the Lagrangian then the flow along its prolongation preserves the prolonged shell (49) in that the Lie derivative of the Euler-Lagrange form along vanishes on :
Notice that for any vector field the Lie derivative (prop. 3) of the Euler-Lagrange form differs from that of its component functions by a term proportional to these component functions, which by definition vanishes on-shell:
But the Lie derivative of the component functions is just their plain derivative. Therefore it is sufficient to show that
Now by Noether's theorem I (prop. 30) the condition for an infinitesimal symmetry of the Lagrangian implies that the contraction (def. 13) of the Euler-Lagrange form with the corresponding evolutionary vector field is a total spacetime derivative:
Since the Euler-Lagrange derivative vanishes on total spacetime derivatives (example 50) also its application on the contraction on the left vanishes. But via example 56 that contraction is a pairing of field-dependent sections as in prop. 31. Hence we use this proposition to compute:
Here the first step is by prop. 31, the second step is by prop. 32 and the third step is (74).
Hence
where in the last line we used that on the prolonged shell and all its horizontal derivatives vanish, by definition.
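The chain of steps in this proof can be summarized in one display. This is a sketch in assumed notation: $v\cdot\delta_{EL}L := v^a\,\frac{\delta_{EL}L}{\delta\phi^a}$ for the pairing, $\mathrm{D}_{(-)}$ for the evolutionary derivative and $(-)^\ast$ for the formal adjoint.

```latex
\begin{align*}
  0 \;=\; \delta_{EL}\big(v\cdot\delta_{EL}L\big)
    &= (\mathrm{D}_v)^\ast\big(\delta_{EL}L\big)
     + (\mathrm{D}_{\delta_{EL}L})^\ast(v)
     && \text{(derivation property, prop. 31)} \\
    &= (\mathrm{D}_v)^\ast\big(\delta_{EL}L\big)
     + \mathrm{D}_{\delta_{EL}L}(v)
     && \text{(formal self-adjointness, prop. 32)} \\
    &= (\mathrm{D}_v)^\ast\big(\delta_{EL}L\big)
     + \hat v\big(\delta_{EL}L\big)
     && \text{(definition of the evolutionary derivative)}
\end{align*}
% Hence \hat v(\delta_{EL}L) = -(\mathrm{D}_v)^\ast(\delta_{EL}L).
% The right-hand side is a differential operator applied to \delta_{EL}L,
% so it vanishes wherever \delta_{EL}L and all its total derivatives vanish,
% i.e. on the prolonged shell.
```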
As a corollary we obtain:
(flow along infinitesimal symmetry of the Lagrangian preserves on-shell space of field histories)
Let be a Lagrangian field theory (def. 60).
For an infinitesimal symmetry of the Lagrangian (def. 66) the flow on the space of field histories (example 16) that it induces by def. 65 preserves the space of on-shell field histories (from prop. 22):
By def. 61 a field history is on-shell precisely if its jet prolongation (def. 55) factors through the shell (48). Hence by def. 65 the statement is equivalently that the ordinary flow (prop. 3) of (def. 28) on the jet bundle preserves the shell. This in turn means that it preserves the vanishing locus of the Euler-Lagrange form , which is the case by prop. 33.
infinitesimal symmetries of the presymplectic potential current
Evidently Noether's theorem I in variational calculus (prop. 30) is the special case for horizontal -forms of a more general phenomenon relating symmetries of variational forms to forms that are closed up to a contraction. The same phenomenon applied instead to the presymplectic current yields the following:
(variational Lie derivative)
Let be a field bundle (def. 34) with jet bundle (def. 54).
For a vertical tangent vector field on the jet bundle (a variation def. 63) write
for the variational Lie derivative along , analogous to Cartan's homotopy formula (prop. 3) but defined in terms of the variational derivative (35) as opposed to the full de Rham differential.
Then for and two vertical vector fields, write
for the vector field whose contraction operator (def. 13) is given by
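Written out, the two definitions can be sketched like this. The notation $\mathcal{L}^{var}_v$ and $[v_1,v_2]^{var}$ is an assumption made for illustration, with $\delta$ the variational derivative:

```latex
% Variational Lie derivative: Cartan's formula with \delta in place of the full de Rham differential.
\[
  \mathcal{L}^{var}_{v} \;:=\; \delta\circ\iota_{v} \;+\; \iota_{v}\circ\delta
\]
% Variational Lie bracket, defined through its contraction operator:
\[
  \iota_{[v_1,v_2]^{var}}
    \;:=\; \big[\mathcal{L}^{var}_{v_1},\,\iota_{v_2}\big]
    \;=\; \mathcal{L}^{var}_{v_1}\circ\iota_{v_2} \;-\; \iota_{v_2}\circ\mathcal{L}^{var}_{v_1}
\]
% Immediate consequence of \delta\circ\delta = 0:
\[
  \mathcal{L}^{var}_{v}\circ\delta
    \;=\; \delta\circ\iota_{v}\circ\delta
    \;=\; \delta\circ\mathcal{L}^{var}_{v}
\]
```

The last line says that the variational Lie derivative commutes with the variational derivative, just as the ordinary Lie derivative commutes with the de Rham differential.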
(infinitesimal symmetry of the presymplectic potential and Hamiltonian differential forms)
Let be a Lagrangian field theory (def. 60) with presymplectic potential current (50). Write for the shell (48).
Then:
An on-shell variation (def. 63) is an infinitesimal symmetry of the presymplectic current or Hamiltonian vector field if on-shell (def. 22) its variational Lie derivative along (def. 69) is a variational derivative:
for some variational form .
A Hamiltonian differential form (or local Hamiltonian current) is a variational form on the shell such that there exists a variation with
We write
for the space of pairs consisting of a Hamiltonian differential form on-shell and a corresponding variation.
(Hamiltonian Noether's theorem)
A variation is an infinitesimal symmetry of the presymplectic potential (def. 70) with precisely if
is a Hamiltonian differential form for .
Since therefore both the conserved currents from Noether's theorem and the Hamiltonian differential forms are generators of infinitesimal symmetries of certain variational forms (namely of the Lagrangian density and of the presymplectic current, respectively) they form a Lie algebra. For the conserved currents this is sometimes known as the Dickey bracket Lie algebra. For the Hamiltonian forms it is the Poisson bracket Lie p+1-algebra. Since here for simplicity we are considering just vertical variations, we have just a plain Lie algebra. The transgression of this Lie algebra of Hamiltonian forms on the jet bundle to Cauchy surfaces yields a presymplectic structure on phase space, which we discuss below.
Let be a Lagrangian field theory (def. 60).
On the space of pairs of Hamiltonian differential forms with compatible variation (def. 70) the following operation constitutes a Lie bracket:
where is the variational Lie bracket from def. 69.
We call this the local Poisson Lie bracket.
First we need to check that the bracket is well defined in itself. It is clear that it is linear and skew-symmetric, but what needs proof is that it does indeed land in , hence that the following equation holds:
With def. 69 for and we compute this as follows:
This shows that the bracket is well defined.
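The well-definedness check can be sketched as follows for bosonic fields. All identities are understood on-shell, and the sketch rests on assumptions about conventions: $\iota_{v_i}\Omega_{BFV} = \delta H_i$ for the Hamiltonian condition, $\delta\Omega_{BFV} = 0$, $\iota_{[v_1,v_2]} = [\mathcal{L}^{var}_{v_1},\iota_{v_2}]$ for the variational bracket, and $(\iota_{v_1}\iota_{v_2}\Omega_{BFV},\,[v_1,v_2])$ for the bracket of the pairs $(H_1,v_1)$ and $(H_2,v_2)$. The ordering and overall sign may differ from those used in (76).

```latex
\begin{align*}
  \iota_{[v_1,v_2]}\Omega_{BFV}
    &= \mathcal{L}^{var}_{v_1}\iota_{v_2}\Omega_{BFV}
       - \iota_{v_2}\mathcal{L}^{var}_{v_1}\Omega_{BFV} \\
    &= \big(\delta\iota_{v_1} + \iota_{v_1}\delta\big)\,\delta H_2
       - \iota_{v_2}\big(\delta\,\delta H_1 + \iota_{v_1}\delta\Omega_{BFV}\big) \\
    &= \delta\,\iota_{v_1}\delta H_2 \\
    &= \delta\big(\iota_{v_1}\iota_{v_2}\Omega_{BFV}\big)
\end{align*}
% Line 2 uses \iota_{v_i}\Omega_{BFV} = \delta H_i; line 3 uses \delta\delta = 0 and \delta\Omega_{BFV} = 0.
% So \iota_{v_1}\iota_{v_2}\Omega_{BFV} is again Hamiltonian, with Hamiltonian vector field [v_1,v_2].
```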
It remains to see that the bracket satisfies the Jacobi identity:
hence that
Here holds because by def. 69 acts as a derivation, and hence what remains to be shown is that
We check this by repeated uses of def. 69, using in addition that
(since by being Hamiltonian)
(since in addition )
(since is of vertical degree 2, and since all variations are vertical by assumption).
So we compute as follows (a special case of FRS 13b, lemma 3.1.1):
The local Poisson bracket Lie algebra from prop. 36 is but the lowest stage of a higher Lie theoretic structure called the Poisson bracket Lie p-algebra. Here we will not go deeper into this higher structure (see at Higher Prequantum Geometry for more), but below we will need the following simple shadow of it:
The horizontally exact Hamiltonian forms constitute a Lie ideal for the local Poisson Lie bracket (76).
Let be a horizontally exact Hamiltonian form, hence
for some . Write for a Hamiltonian vector field for .
Then for any other pair consisting of a Hamiltonian form and a corresponding Hamiltonian vector field, we have
Here we used that the horizontal derivative anti-commutes with the vertical one by construction of the variational bicomplex, and that anti-commutes with the horizontal derivative since the variation (def. 63) is by definition vertical.
(local Poisson bracket for real scalar field)
Consider the Lagrangian field theory for the free real scalar field from example 39.
By example 45 its presymplectic current is
The corresponding local Poisson bracket algebra (prop. 36) has in degree 0 Hamiltonian forms (def. 35) such as
and
The corresponding Hamiltonian vector fields are
and
Hence the corresponding local Poisson bracket is
More generally for two bump functions then
(local Poisson bracket for free Dirac field)
Consider the Lagrangian field theory of the free Dirac field on Minkowski spacetime (example 43), whose presymplectic current is, according to example 49, given by
Consider this specifically in spacetime dimension in which case the components are complex number-valued (by prop./def. 10), so that the tuple amounts to 8 real-valued coordinate functions. By changing complex coordinates, we may equivalently consider as four coordinate functions, and as another four independent coordinate functions.
Using this coordinate transformation, it is immediate to find the following pairs of Hamiltonian vector fields and their Hamiltonian differential forms from def. 70 applied to (77)
| Hamiltonian vector field | Hamiltonian differential form |
|---|---|
and to obtain the following non-trivial local Poisson brackets (prop. 36) (the other possible brackets vanish):
Notice the signs: Due to the odd-grading of the field coordinate function , its variational derivative has bi-degree and the contraction operation has bi-degree , so that commuting it past picks up two minus signs, a “cohomological” sign due to the differential form degrees, and a “supergeometric” one (def. 47):
For the same reason, the local Poisson bracket is a super Lie algebra with symmetric super Lie bracket:
This concludes our discussion of general infinitesimal symmetries of a Lagrangian. We pick this up again in the discussion of Gauge symmetries below. First, in the next chapter we discuss the concept of observables in field theory.
Given a Lagrangian field theory (def. 39), then a general observable quantity or just observable for short (def. 71 below), is a smooth function
on the on-shell space of field histories (example 16, example 33) hence a smooth “functional” of field histories. We think of this as assigning to each physically realizable field history the value of the given quantity as exhibited by that field history. For instance concepts like “average field strength in the compact spacetime region ” should be observables. In particular the field amplitude at spacetime point should be an observable, denoted .
In much of the literature on field theory, these point evaluation observables (example 60 below) are eventually referred to as “fields” themselves, blurring the distinction between
field species/field bundles ,
field histories (sections of the field bundle),
functions on the space of field histories .
In particular, the process of quantization (discussed in Quantization below) affects the last of these concepts only, in that it deforms the algebra structure on observables to a non-commutative algebra of quantum observables. For this reason the observables are often referred to as quantum fields. But to understand the conceptual nature of quantum field theory it is important to keep in mind that these are really observables, or quantum observables, on the space of field histories.
| aspect | term | type | description | def. |
|---|---|---|---|---|
| field component | | | coordinate function on jet bundle of field bundle | def. 34, def. 54 |
| field history | | | jet prolongation of section of field bundle | def. 34, def. 55 |
| field observable | | | derivatives of delta-functional on space of sections | def. 71, example 60 |
| averaging of field observable | | | observable-valued distribution | def. 80 |
| algebra of quantum observables | | | non-commutative algebra structure on field observables | def. 127, def. 132 |
There are various further conditions on observables which we will eventually consider, forming subspaces of gauge invariant observables (def. 98), local observables (def. 83 below), Hamiltonian local observables (def. 89 below) and microcausal observables (def. 126). While in the end it is only these special kinds of observables that matter, it is useful to first consider the unconstrained concept and then consecutively characterize smaller subspaces of well-behaved observables. In fact it is useful to consider yet more generally the observables on the full space of field histories (not just the on-shell subspace), called the off-shell observables.
In the case that the field bundle is a vector bundle (example 9), the off-shell space of field histories is canonically a vector space and hence it makes sense to consider linear off-shell observables, i.e. those observables $A$ with $A(c \Phi) = c A(\Phi)$ and $A(\Phi + \Phi') = A(\Phi) + A(\Phi')$. It turns out that these are precisely the compactly supported distributions in the sense of Laurent Schwartz (prop. 37 below). This fact makes powerful tools from functional analysis and microlocal analysis available for the analysis of field theory (discussed below).
More generally there are the multilinear off-shell observables, and these are analogously given by distributions of several variables (def. 76 below). In fully perturbative quantum field theory one considers only the infinitesimal neighbourhood (example 27) of a single on-shell field history and in this case all observables are in fact given by such multilinear observables (def. 84 below).
For a free field theory (def. 62) whose Euler-Lagrange equations of motion are given by a linear differential operator which behaves well in that it is “Green hyperbolic” (def. 79 below) it follows that the actual on-shell linear observables are equivalently those off-shell observables which are spatially compactly supported distributional solutions to the formally adjoint equation of motion (prop. 43 below); and this equivalence is exhibited by composition with the causal Green function (def. 78 below):
This is theorem 1 below, which is pivotal for passing from classical field theory to quantum field theory:
This fact makes, in addition, the distributional analysis of linear differential equations available for the analysis of free field theory, notably the theory of propagators, such as Feynman propagators (def. 108 below), which we turn to in Propagators below.
The functional analysis and microlocal analysis (below) of linear observables re-expressed in distribution theory via theorem 1 solves the issues that the original formulation of perturbative quantum field theory by Schwinger-Tomonaga-Feynman-Dyson in the 1940s was notorious for suffering from (Feynman 85): The normal ordered product of quantum observables in a Wick algebra of observables follows from Hörmander's criterion for the product of distributions to be well-defined (this we discuss in Free quantum fields below) and the renormalization freedom in the construction of the S-matrix is governed by the mechanism of extensions of distributions (this we discuss in Renormalization below).
Among the polynomial on-shell observables characterized this way, the focus is furthermore on the local observables:
In local field theory the idea is that both the equations of motion as well as the observations are fully determined by their restriction to infinitesimal neighbourhoods of spacetime points (events). For the equations of motion this means that they are partial differential equations as we have seen above. For the observables it should mean that they must be averages over regions of spacetime of functions of the value of the field histories and their derivatives at any point of spacetime. Now a “smooth function of the value of the field histories and their derivatives at any point” is precisely a smooth function on the jet bundle of the field bundle (example 54) pulled back via jet prolongation (def. 55). If this is to be averaged over spacetime it needs to be the coefficient of a horizontal -form (prop. 59).
In mathematical terminology these desiderata say that the local observables in a local field theory should be precisely the “transgressions” (def. 82 below) of horizontal variational -forms (with compact spacetime support, def. 81 below) to the space of field histories (example 16). This is def. 83 below.
A key example of a local observable in Lagrangian field theory (def. 60) is the action functional (example 66 below). This is the transgression of the Lagrangian density itself, or rather of its product with an “adiabatic switching function” that localizes its support in a compact spacetime region. In typical cases the physical quantity whose observation is represented by the action functional is the kinetic energy-momentum minus the potential energy of a field history, averaged over the given region of spacetime.
The equations of motion of a Lagrangian field theory say that those field histories are physically realized which are critical points of this action functional observable. This is the principle of extremal action (prop. 45 below).
This formalizes what it means for a field history to be “realizable” (physically admissible) (a solution to the Euler-Lagrange equations, def. 61) and what the (local) observable quantities on field histories are (def. 83). It remains to formalize what it means for the physical system to be in some definite state so that the observable quantities take some definite value, reflecting the properties of that state.
Whatever formalization for states of a field theory one considers, at the very least the space of states should come with a pairing linear map
which reads in an observable quantity and a state, to be denoted , and produces the complex number which is the “value of the observable quantity in the case that the physical system is in the state ”.
One might imagine that it is fundamentally possible to pinpoint the exact field history that the physical system is found in. From this perspective, fixing a state should simply mean to pick such a field history, namely an element in the on-shell space of field histories. If we write for this state, its pairing map with the observables would simply be evaluation of the observable, being a function on the field history space, on that particular element in this space:
However, in the practice of experiment a field history can never be known precisely, without remaining uncertainty. Moreover, quantum physics (to which we finally come below) suggests that this is true not just in practice, but even in principle. Therefore we should allow states to be a kind of probability distribution on the space of field histories, and regard the pairing of a state with an observable as a kind of expectation value of the function averaged with respect to this probability distribution. Specifically, if the observable quantity is (a smooth approximation to) a characteristic function of a subset of the space of field histories, then its value in a given state should be the probability to find the physical system in that subset of field histories.
But, moreover, the superposition principle of quantum physics says that the actually observable observables are only those of the form $A^\ast A$ (for $A^\ast$ the image of $A$ under the star-operation on the star algebra of observables).
This finally leads to the definition of states in def. 86 below.
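For orientation, the standard notion of a state on a star-algebra can be sketched as follows. The notation is hypothetical ($\mathrm{Obs}$ for the star-algebra of observables, $\langle - \rangle$ for the state), and whether def. 86 imposes exactly these conditions is an assumption here.

```latex
\begin{align*}
  \langle - \rangle \colon \mathrm{Obs} &\longrightarrow \mathbb{C}
    && \text{(linear)} \\
  \langle A^\ast A \rangle &\geq 0
    && \text{(positivity)} \\
  \langle 1 \rangle &= 1
    && \text{(normalization)}
\end{align*}
% Classical special case: the "delta state" concentrated on one field history \Phi,
% with the star-operation given by pointwise complex conjugation:
\[
  \langle A \rangle_{\delta_\Phi} \;=\; A(\Phi)
  \qquad\Longrightarrow\qquad
  \langle A^\ast A \rangle_{\delta_\Phi} \;=\; |A(\Phi)|^2 \;\geq\; 0
\]
```

Positivity on elements of the form $A^\ast A$ is what allows the values on characteristic-function-like observables to be read as probabilities.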
We now discuss these topics:
General observables
Let be a Lagrangian field theory (def. 60) with its on-shell space of field histories (def. 61).
Then the space of observables is the super formal smooth set (def. 48) which is the mapping space
from the on-shell space of field histories to the complex numbers.
Similarly there is the space of off-shell observables
Every off-shell observable induces an on-shell observable by restriction; this yields a smooth function
Similarly we may consider the observables on the subspaces of field histories with restricted causal support according to def. 31. We write
and
for the spaces of (off-shell) observables on field histories with spatially compact support (def. 31).
Observables on bosonic fields
In the case that is a purely bosonic field bundle in smooth manifolds so that is a diffeological space (def. 16, def. 61) this means that a single observable is equivalently a smooth function (def. 35)
Explicitly, by def. 36 (and similarly by def. 48) this means that is for each Cartesian space (generally: super Cartesian space, def. 46) a natural function of plots
Observables on fermionic fields
In the case that has purely fermionic fibers (def. 50), such as for the Dirac field (example 35) with then the only point in is the zero-observable; instead, an observable is now a morphism
and its component is a bosonic observable as above.
The most basic kind of observables are the following:
(point evaluation observables – field observables)
Let be a Lagrangian field theory (def. 60) whose field bundle (def. 34) over some spacetime happens to be a trivial vector bundle in even degree (i.e. bosonic) with field fiber coordinates (example 9). With respect to these coordinates a field history, hence a section of the field bundle
has components which are smooth functions on spacetime.
Then for every index and every point in spacetime (every event) there is an observable (def. 71) denoted which is given by
hence which on a test space (a Cartesian space or more generally super Cartesian space, def. 46) sends a -parameterized collection of fields
to their -parameterized collection of values at of their -th component.
Notice how the various aspects of the concept of “field” are involved here, all closely related but crucially different:
Polynomial off-shell Observables and Distributions
We consider here linear observables (def. 72 below) and more generally quadratic observables (def. 75) and generally polynomial observables (def. 76 below) for free field theories and discuss how these are equivalently given by integration against generalized functions called distributions (prop. 37 and prop. 38 below).
This is the basis for the discussion of quantum observables for free field theories further below.
(linear off-shell observables)
Let be a Lagrangian field theory (def. 60) whose field bundle (def. 34) is a super vector bundle (as in example 9 and as opposed to more general non-linear fiber bundles).
This means that the off-shell space of field histories (example 33) inherits the structure of a super vector space by spacetime-pointwise (i.e. event-wise) scaling and addition of field histories.
Then an off-shell observable (def. 71)
is a linear observable if it is a linear function with respect to this vector space structure, hence if
for all plots of field histories .
We write
for the subspace of linear observables inside all observables (def. 71) and similarly
for the linear off-shell observables inside all off-shell observables, and similarly for the subspaces of linear observables on field histories of spatially compact support (79):
and
(point evaluation observables are linear)
Let be a Lagrangian field theory (def. 60) over Minkowski spacetime (def. 23), whose field bundle (def. 34) is the trivial vector bundle with field coordinates (example 9).
Then for each field component index and point of spacetime (each event) the point evaluation observable (example 60)
is a linear observable according to def. 72. The distribution that it corresponds to under prop. 37 is the Dirac delta-distribution at the point combined with the Kronecker delta on the index : In the generalized function-notation of remark 19 this reads:
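A sketch of both statements, assuming we write $A$ for an observable, $\Phi_{(-)}, \Phi'_{(-)}$ for plots of field histories, $c$ for a constant and $\mathbf{\Phi}^a(x)$ for the point evaluation observable (all of these symbols are our notational assumptions):

```latex
% linearity of an observable A with respect to the vector space structure
\[
  A\big(c\,\Phi_{(-)}\big) = c\,A\big(\Phi_{(-)}\big),
  \qquad
  A\big(\Phi_{(-)} + \Phi'_{(-)}\big) = A\big(\Phi_{(-)}\big) + A\big(\Phi'_{(-)}\big)
\]
% point evaluation as integration against Kronecker delta times Dirac delta
\[
  \mathbf{\Phi}^a(x)(\Phi)
  \;=\;
  \int_\Sigma \sum_b \delta^a_b \, \delta(x - y)\, \Phi^b(y)\, \mathrm{dvol}_\Sigma(y)
  \;=\;
  \Phi^a(x)
\]
```

Linearity of point evaluation is then immediate, since $(c\,\Phi + \Phi')^a(x) = c\,\Phi^a(x) + \Phi'^a(x)$.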
(linear off-shell observables of scalar field are the compactly supported distributions)
Let be a Lagrangian field theory (def. 60) over Minkowski spacetime (def. 23), whose field bundle (def. 34) is the trivial real line bundle (as for the real scalar field, example 10). This means that the off-shell space of field histories (19) is the real vector space of smooth functions on Minkowski spacetime and that every linear observable (def. 72) gives a linear function
This linear function is in fact a compactly supported distribution, in the sense of functional analysis, in that it satisfies the following Fréchet vector space continuity condition:
Fréchet continuous linear functional
A linear function is called continuous if there exists
such that for all off-shell field histories
the following inequality of absolute values of partial derivatives holds
where the sum is over all multi-indices (1) whose total degree is bounded by , and where
denotes the corresponding partial derivative (1).
This identification constitutes a linear isomorphism
saying that all compactly supported distributions arise from linear off-shell observables of the scalar field this way, and uniquely so.
For the proof see the entry “distributions are the smooth linear functionals”.
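The continuity condition in this proposition may be sketched as follows. We assume here the standard form of the estimate for compactly supported distributions, with a compact subset $K \subset \mathbb{R}^{p,1}$, a natural number $k$ and a positive real constant $C$; these names, and the notation $\mathcal{E}'$ for compactly supported distributions, are assumptions of this sketch:

```latex
% Frechet continuity of a linear function A : C^\infty(R^{p,1}) -> R
\[
  \exists\; K \subset \mathbb{R}^{p,1} \text{ compact},\;\;
  k \in \mathbb{N},\;\;
  C \in \mathbb{R}_{> 0}
  \quad \text{such that} \quad
  \forall\; \Phi \in C^\infty(\mathbb{R}^{p,1}) :
\]
\[
  \vert A(\Phi) \vert
  \;\leq\;
  C \sum_{\vert \alpha \vert \leq k} \; \sup_{x \in K} \,
  \big\vert \partial^\alpha \Phi(x) \big\vert
\]
% the resulting identification with compactly supported distributions
\[
  \mathrm{LinObs}\big(\mathbb{R}^{p,1} \times \mathbb{R}\big)
  \;\simeq\;
  \mathcal{E}'\big(\mathbb{R}^{p,1}\big)
\]
```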
The identification from prop. 37 of linear off-shell observables with compactly supported distributions makes available powerful tools from functional analysis. The key fact is the following:
(distributions are generalized functions)
For , every compactly supported smooth function on the Cartesian space induces a distribution (prop. 37), hence a continuous linear functional, by integration against times the volume form.
The distributions arising this way are called the non-singular distributions.
This construction is clearly a linear inclusion
and in fact this is a dense subspace inclusion for the space of compactly supported distributions equipped with the dual space topology (this def.) to the Fréchet space structure on from prop. 37.
Hence every compactly supported distribution is the limit of a sequence of compactly supported smooth functions in that for every smooth function we have that the value is the limit of integrals against :
(e. g. Hörmander 90, theorem 4.1.5)
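In formulas, assuming we write $b$ for a compactly supported smooth function, $f$ for a smooth function, $u$ for a compactly supported distribution and $(b_k)_{k \in \mathbb{N}}$ for an approximating sequence (these names are our assumptions), the inclusion and the density statement may be sketched as:

```latex
% a compactly supported smooth function b regarded as a distribution
\[
  C^\infty_{cp}(\mathbb{R}^n) \hookrightarrow \mathcal{E}'(\mathbb{R}^n),
  \qquad
  b \;\mapsto\; \Big( f \mapsto \int_{\mathbb{R}^n} b(x)\, f(x)\, \mathrm{dvol}(x) \Big)
\]
% density: every compactly supported distribution u is a limit of such integrals
\[
  u(f)
  \;=\;
  \lim_{k \to \infty} \int_{\mathbb{R}^n} b_k(x)\, f(x)\, \mathrm{dvol}(x)
  \qquad
  \text{for all } f \in C^\infty(\mathbb{R}^n)
\]
```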
Proposition 38 with prop. 37 implies that with due care we may think of all linear off-shell observables as arising from integration of field histories against some “generalized smooth functions” (namely a limit of actual smooth functions):
(linear off-shell observables of real scalar field as integration against generalized functions)
Let be a Lagrangian field theory (def. 60) over Minkowski spacetime (def. 23), whose field bundle (def. 34) is a trivial vector bundle with field coordinates .
Prop. 37 implies immediately that in this situation linear off-shell observables (def. 72) correspond to tuples of compactly supported distributions via
With prop. 38 it follows furthermore that there is a sequence of tuples of smooth functions such that is the limit of the integrations against these:
where now the sum over the index is again left notationally implicit.
For handling distributions/linear off-shell observables it is therefore useful to adopt, with due care, shorthand notation as if the limits of the sequences of smooth functions actually existed, as “generalized functions” , and to set
This suggests that basic operations on functions, such as their pointwise product, should be extended to distributions, e.g. to a product of distributions. This turns out to exist, as long as the high-frequency modes in the Fourier transform of the distributions being multiplied cancel out – the mathematical reflection of “UV-divergences” in quantum field theory. This we turn to in Free quantum fields below.
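To illustrate the condition on high-frequency modes, here is a standard one-dimensional sketch (our choice of example, not taken from the discussion above). We assume the Fourier convention $\hat u(k) = \int u(x)\, e^{-i k x}\, dx$, under which the product of functions corresponds to $(2\pi)^{-1}$ times the convolution $\star$ of their Fourier transforms:

```latex
% the naive square of the delta distribution: the convolution integral diverges
\[
  \hat \delta(k) = 1
  \quad \Rightarrow \quad
  (\hat \delta \star \hat \delta)(k)
  = \int_{-\infty}^{\infty} 1 \cdot 1 \; dq
  \quad \text{diverges}
\]
% a distribution whose Fourier transform is supported on k >= 0 only
\[
  u(x) := \frac{1}{x + i 0^+}
  \quad \Rightarrow \quad
  \hat u(k) = - 2 \pi i \, \Theta(k),
  \qquad
  (\hat u \star \hat u)(k)
  = (-2 \pi i)^2 \int \Theta(q)\, \Theta(k - q)\, dq
  = - 4 \pi^2 \, k \, \Theta(k)
\]
% consistency check against the derivative of u
\[
  \widehat{u \cdot u}(k)
  = \tfrac{1}{2\pi} (\hat u \star \hat u)(k)
  = - 2 \pi \, k \, \Theta(k)
  = - i k \, \hat u(k)
  \quad \Rightarrow \quad
  u \cdot u = - \tfrac{d}{d x} u
\]
```

In the second case the frequencies $q$ and $k - q$ both stay within a bounded interval for each fixed $k$, which is why the convolution integral is finite.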
These considerations generalize from the field bundle of the real scalar field to general field bundles (def. 34) as long as they are smooth vector bundles (def. 7):
(Fréchet topological vector space on spaces of smooth sections of a smooth vector bundle)
Let be a field bundle (def. 34) which is a smooth vector bundle (def. 7) over Minkowski spacetime (def. 23); hence, up to isomorphism, a trivial vector bundle as in example 9.
On its real vector space of smooth sections consider the seminorms indexed by a compact subset and a natural number and given by
where on the right we have the absolute values of the partial derivatives of indexed by (1) with respect to any choice of norm on the fibers.
This makes a Fréchet topological vector space.
For any closed subset then the sub-space of sections
whose support is inside becomes a Fréchet topological vector space with the induced subspace topology, which makes it a closed subspace.
Finally, the vector spaces of smooth sections with prescribed causal support (def. 31) are inductive limits of vector spaces as above, and hence they inherit topological vector space structure by forming the corresponding inductive limit in the category of topological vector spaces. For instance
etc.
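A sketch of these structures, assuming the usual form of the seminorms for the compact-open $C^\infty$-topology and writing $p_K^N$, $\Gamma_\Sigma(E)$, $\Gamma_{\Sigma,S}(E)$ and $\Gamma_{\Sigma,cp}(E)$ for the seminorms and the spaces of sections (all of this notation is an assumption of the sketch):

```latex
% seminorm indexed by a compact subset K and a natural number N
\[
  p_K^N(\Phi)
  \;:=\;
  \max_{\vert \alpha \vert \leq N} \; \sup_{x \in K} \,
  \big\vert \partial^\alpha \Phi(x) \big\vert
\]
% sections supported in a closed subset S form a closed subspace
\[
  \Gamma_{\Sigma,S}(E)
  := \big\{ \Phi \in \Gamma_\Sigma(E) \;\vert\; \mathrm{supp}(\Phi) \subset S \big\}
  \;\hookrightarrow\;
  \Gamma_\Sigma(E)
\]
% compactly supported sections as an inductive limit over compact subsets
\[
  \Gamma_{\Sigma,cp}(E)
  \;\simeq\;
  \underset{\underset{K \subset \Sigma \text{ compact}}{\longrightarrow}}{\lim}\,
  \Gamma_{\Sigma,K}(E)
\]
```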
Let be a smooth vector bundle (def. 7) over Minkowski spacetime (def. 23).
The vector spaces of smooth sections with restricted support from def. 31 carry the structure of topological vector spaces via def. 73. We denote the dual topological vector spaces by
This is called the space of distributional sections of the bundle .
The support of a distributional section is the set of points in such that for every neighbourhood of that point does not vanish on all sections with support in that neighbourhood.
Imposing the same restrictions to the supports of distributional sections as in def. 31, we have the following subspaces of distributional sections:
(Sanders 13, Bär 14)
As before in prop. 38 the actual smooth sections yield examples of distributional sections, and all distributional sections arise as limits of integrations against smooth sections:
(non-singular distributional sections)
Let be a smooth vector bundle over Minkowski spacetime and let be any of the support conditions from def. 31.
Then the operation of regarding a compactly supported smooth section of the dual vector bundle as a functional on sections with this support property is a dense subspace inclusion into the topological vector space of distributional sections from def. 74:
(distribution dualities with causally restricted supports)
Let be a smooth vector bundle (def. 7) over Minkowski spacetime (def. 23).
Then there are the following isomorphisms of topological vector spaces between a) dual spaces of spaces of sections with restricted causal support (def. 31) and equipped with the topology from def. 73 and b) spaces of distributional sections with restricted supports, according to def. 74:
(Sanders 13, thm. 4.3, Bär 14, lem. 2.14)
The concept of linear observables naturally generalizes to that of multilinear observables:
(quadratic off-shell observables)
Let be a Lagrangian field theory (def. 60) over a spacetime whose field bundle (def. 34) is a super vector bundle.
The external tensor product of vector bundles of the field bundle with itself, denoted
is the vector bundle over the Cartesian product , of spacetime with itself, whose fiber over a pair of points is the tensor product of the corresponding field fibers.
Given a field history, hence a section of the field bundle, there is then the induced section .
We say that an off-shell observable
is quadratic if it comes from a “bilinear observable”, namely a smooth function on the space of sections of the external tensor product of the field bundle with itself
as
More explicitly: By prop. 37 the quadratic observable is given by a compactly supported distribution of two variables which in the notation of remark 19 comes from a matrix of generalized functions as
This notation makes manifest how the concept of quadratic observables is a generalization of that of quadratic forms coming from bilinear forms.
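A sketch of this expression in generalized function notation, assuming we write $\beta_{a b}(x,y)$ for the matrix of generalized functions and $B_{a b}$ for the matrix of an ordinary bilinear form (both names are our assumptions):

```latex
% quadratic observable from a distribution of two variables
\[
  A(\Phi)
  \;=\;
  \int_\Sigma \int_\Sigma
    \Phi^a(x)\, \beta_{a b}(x, y)\, \Phi^b(y) \;
  \mathrm{dvol}_\Sigma(x)\, \mathrm{dvol}_\Sigma(y)
\]
% finite-dimensional analog: quadratic form q from a bilinear form B
\[
  q(v) \;=\; v^a \, B_{a b}\, v^b
\]
```

The spacetime points $x$, $y$ play the role of continuous indices next to the discrete indices $a$, $b$.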
(polynomial off-shell observables)
Let be a Lagrangian field theory (def. 60) over a spacetime whose field bundle (def. 34) is a super vector bundle.
An off-shell observable (def. 71)
is polynomial if it is the sum of a constant, and a linear observable (def. 72), and a quadratic observable (def. 75) and so on:
In summary, the above establishes that the Schwartz theory of (compactly supported) distributions neatly applies to characterize smooth polynomial observables on the diffeological space of field histories for a field bundle which is a vector bundle.
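In the same generalized function notation, a polynomial observable may be sketched as a finite sum of the following form, where the coefficient names $\alpha^{(k)}$ and the top degree $n$ are assumptions of this sketch:

```latex
\[
  A(\Phi)
  \;=\;
  \alpha^{(0)}
  \;+\;
  \int_\Sigma \alpha^{(1)}_{a}(x)\, \Phi^a(x)\, \mathrm{dvol}_\Sigma(x)
  \;+\;
  \int_\Sigma \int_\Sigma
    \alpha^{(2)}_{a_1 a_2}(x_1, x_2)\,
    \Phi^{a_1}(x_1)\, \Phi^{a_2}(x_2)\,
  \mathrm{dvol}_\Sigma(x_1)\, \mathrm{dvol}_\Sigma(x_2)
  \;+\; \cdots \;+\;
  \int_{\Sigma^n}
    \alpha^{(n)}_{a_1 \cdots a_n}(x_1, \ldots, x_n)\,
    \Phi^{a_1}(x_1) \cdots \Phi^{a_n}(x_n)\,
  \mathrm{dvol}_{\Sigma^n}
\]
```

Here $\alpha^{(0)}$ is a constant and each $\alpha^{(k)}$ is a compactly supported distribution of $k$ variables.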
Polynomial on-shell Observables and Distributional solutions to PDEs
While every off-shell observable induces an on-shell observable simply by restriction (78), different off-shell observables may restrict to the same on-shell observable. It is therefore useful to find a condition on off-shell observables that makes them equivalent to on-shell observables under restriction. Here we discuss in the case of sufficiently well behaved free field equations of motion – namely Green hyperbolic differential equations, def. 79 below – that this on-shell condition on the linear off-shell observables (def. 72) is that they are distributional solutions to the formal adjoint to the equations of motion, under their identification with distributions via prop. 37.
While in general the equations of motion are not Green hyperbolic – namely not in the presence of implicit infinitesimal gauge symmetries discussed in Gauge symmetries below – it turns out that up to a suitable notion of equivalence they are equivalent to those that are; this we discuss in Gauge fixing below.
(derivatives of distributions and distributional solutions of PDEs)
Given a pair of formally adjoint differential operators (def. 58) then the distributional derivative of a distributional section (def. 74) by is the distributional section
If
then we say that is a distributional solution (or generalized solution) of the homogeneous differential equation defined by .
(ordinary PDE solutions are generalized solutions)
Let be a smooth vector bundle over Minkowski spacetime and let be a pair of formally adjoint differential operators.
Then for every non-singular distributional section coming from an actual smooth section via prop. 39 the derivative of distributions (def. 77) is the distributional section induced from the ordinary derivative of smooth functions:
In particular is a distributional solution to the PDE precisely if is an ordinary solution:
For all we have
where all steps are by the definitions except the third, which is by the definition of formally adjoint differential operator (def. 58), using that by the compact support of and the Stokes theorem (prop. 4) the term in def. 58 does not contribute to the integral.
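The definition and the computation above may be sketched as follows. We assume the notation $P$, $P^\ast$ for the pair of formally adjoint differential operators, $u$ for a distributional section, $u_\Phi$ for the non-singular distributional section induced by a smooth section $\Phi$, and $b$ for a compactly supported test section; we also assume that the operators are typed such that all pairings below are defined:

```latex
% distributional derivative and distributional solutions
\[
  P u := u\big( P^\ast (-) \big),
  \qquad
  u \text{ is a distributional solution}
  \;\Leftrightarrow\;
  P u = 0
\]
% on a non-singular distribution the derivative is the ordinary one
\begin{align*}
  (P u_\Phi)(b)
  & = u_\Phi( P^\ast b )
  \\
  & = \int_\Sigma \Phi \cdot (P^\ast b) \; \mathrm{dvol}_\Sigma
  \\
  & = \int_\Sigma (P \Phi) \cdot b \; \mathrm{dvol}_\Sigma
  \\
  & = u_{P \Phi}(b)
\end{align*}
```

The third equality is the one that uses formal adjointness together with the compact support of $b$.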
(advanced and retarded Green functions and causal Green function)
Let be a field bundle (def. 34) which is a vector bundle (def. 7) over Minkowski spacetime (def. 23). Let be a differential operator (def. 56) on its space of smooth sections.
Then a linear map
from spaces of smooth sections of compact support to spaces of sections of causally sourced future/past support (def. 31) is called an advanced or retarded Green function for , respectively, if
for all we have
and
the support of is in the closed future cone or closed past cone of the support of , respectively.
If the advanced/retarded Green functions exist, then the difference
is called the causal Green function.
(e.g. Bär 14, def. 3.2, cor. 3.10)
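A sketch of the conditions in this definition, assuming we write $G_+$ for the advanced and $G_-$ for the retarded Green function, $\Gamma_{\Sigma,cp}(E)$ for compactly supported sections, $\Gamma_{\Sigma,\pm cp}(E)$ for sections with causally sourced future/past support, and $\overline{V}^\pm$ for the closed future/past cone (all of these symbols are notational assumptions):

```latex
\[
  G_\pm \;\colon\; \Gamma_{\Sigma,cp}(E) \longrightarrow \Gamma_{\Sigma,\pm cp}(E)
\]
% two-sided inverse property on compactly supported sections
\[
  G_\pm \circ P \vert_{\Gamma_{\Sigma,cp}(E)} = \mathrm{id},
  \qquad
  P \circ G_\pm = \mathrm{id}
\]
% causal support property
\[
  \mathrm{supp}\big( G_\pm \Phi \big)
  \;\subset\;
  \overline{V}^\pm\big( \mathrm{supp}(\Phi) \big)
\]
% causal Green function
\[
  G \;:=\; G_+ - G_-
\]
```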
(Green hyperbolic differential equation)
Let be a field bundle (def. 34) which is a vector bundle (def. 7) over Minkowski spacetime (def. 23).
A differential operator (def. 57)
is called a Green hyperbolic differential operator if as well as its formal adjoint differential operator (def. 58) admit advanced and retarded Green functions (def. 78).
(Bär 14, def. 3.2, Khavkine 14, def. 2.2)
The two archetypical examples of Green hyperbolic differential equations are the Klein-Gordon equation and the Dirac equation on Minkowski spacetime. For the moment we just cite the existence of the advanced and retarded Green functions for these; we will work these out in detail below in Propagators.
(Klein-Gordon equation is a Green hyperbolic differential equation)
The Klein-Gordon equation, hence the Euler-Lagrange equation of motion of the free scalar field (example 25) is a Green hyperbolic differential equation (def. 79) and formally self-adjoint (example 51).
(e. g. Bär-Ginoux-Pfaeffle 07, Bär 14, example 3.3)
(Dirac operator is Green hyperbolic)
The Dirac equation, hence the Euler-Lagrange equation of motion of the massive free Dirac field (example 52) is a Green hyperbolic differential equation (def. 79) and formally anti self-adjoint (example 53).
(Bär 14, corollary 3.15, example 3.16)
(causal Green functions of formally adjoint Green hyperbolic differential operators are formally adjoint)
Let
be a pair of Green hyperbolic differential operators (def. 79) which are formally adjoint (def. 58). Then also their causal Green functions and (def. 78) are formally adjoint to each other, up to a sign:
We did not require that the advanced and retarded Green functions of a Green hyperbolic differential operator are unique; in fact this is automatic:
(advanced and retarded Green functions of Green hyperbolic differential operator are unique)
The advanced and retarded Green functions (def. 78) of a Green hyperbolic differential operator (def. 79) are unique.
Moreover we did not require that the advanced and retarded Green functions of a Green hyperbolic differential operator come from integral kernels (“propagators”). This, too, is automatic:
(causal Green functions of Green hyperbolic differential operators are continuous linear maps)
Given a Green hyperbolic differential operator (def. 79), the advanced, retarded and causal Green functions of (def. 78) are continuous linear maps with respect to the topological vector space structure from def. 73 and also have a unique continuous extension to the spaces of sections with larger support (def. 31) as follows:
such that we still have the relation
and
and
By the Schwartz kernel theorem the continuity of implies that there are integral kernels
such that, in the notation of generalized functions,
These integral kernels are called the advanced and retarded propagators. Similarly the combination
is called the causal propagator.
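In generalized function notation this may be sketched as follows, assuming we write $\Delta_\pm$ for the integral kernels of $G_\pm$ and $\Delta$ for their difference (these symbols are assumptions of this sketch):

```latex
% Green functions as integration against their kernels
\[
  (G_\pm \Phi)(x)
  \;=\;
  \int_\Sigma \Delta_\pm(x, y)\, \Phi(y)\, \mathrm{dvol}_\Sigma(y)
\]
% causal propagator as the kernel of the causal Green function
\[
  \Delta \;:=\; \Delta_+ - \Delta_-,
  \qquad
  (G \Phi)(x)
  \;=\;
  \int_\Sigma \Delta(x, y)\, \Phi(y)\, \mathrm{dvol}_\Sigma(y)
\]
```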
We now come to the main theorem on polynomial observables:
(exact sequence of Green hyperbolic differential operator)
Let be a Green hyperbolic differential operator (def. 79) with causal Green function (def. 78). Then the sequences
of these operators restricted to functions with causally restricted supports as indicated (def. 31) are exact sequences of topological vector spaces and continuous linear maps between them.
Under passing to dual spaces and using the isomorphisms of spaces of distributional sections (def. 74) from prop. 40 this yields the following dual exact sequence of topological vector spaces and continuous linear maps between them:
This is due to Igor Khavkine, based on (Khavkine 14, prop. 2.1); for the proof see the entry Green hyperbolic differential operator.
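The first pair of sequences may be sketched as follows, in the form in which they appear in (Khavkine 14, prop. 2.1). We assume the subscripts $cp$, $scp$, $tcp$ for compact, spacelike compact and timelike compact support, and we suppress the identification of the codomain bundle of $P$ with $E$; this notation is an assumption of the sketch:

```latex
% sequence with compact / spacelike compact supports
\[
  0 \to
  \Gamma_{\Sigma,cp}(E)
  \overset{P}{\longrightarrow}
  \Gamma_{\Sigma,cp}(E)
  \overset{G}{\longrightarrow}
  \Gamma_{\Sigma,scp}(E)
  \overset{P}{\longrightarrow}
  \Gamma_{\Sigma,scp}(E)
  \to 0
\]
% sequence with timelike compact / unrestricted supports
\[
  0 \to
  \Gamma_{\Sigma,tcp}(E)
  \overset{P}{\longrightarrow}
  \Gamma_{\Sigma,tcp}(E)
  \overset{G}{\longrightarrow}
  \Gamma_{\Sigma}(E)
  \overset{P}{\longrightarrow}
  \Gamma_{\Sigma}(E)
  \to 0
\]
```

Exactness at the third term says that every solution of $P \Phi = 0$ is of the form $G \Psi$, and exactness at the second term says that $G \Psi = 0$ precisely if $\Psi$ is in the image of $P$.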
(on-shell space of field histories for Green hyperbolic free field theories)
Let be a free Lagrangian field theory (def. 43) whose Euler-Lagrange equation of motion is Green hyperbolic (def. 79).
Then the on-shell space of field histories (or of field histories with spatially compact support, def. 31) is, as a vector space, linearly isomorphic to the quotient space of compactly supported sections (or of temporally compactly supported sections, def. 31) by the image of the differential operator , and this isomorphism is given by the causal Green function (83)
This is a direct consequence of the exactness of the sequence (85) in lemma 3.
We spell this out for the statement for , which follows from the first line in (85); the first statement follows similarly from the second line of (85):
First the on-shell space of field histories is the kernel of , by definition of free field theory (def. 43)
Second, exactness of the sequence (85) at means that the kernel of equals the image . But by exactness of the sequence at it follows that becomes injective on the quotient space . Therefore on this quotient space it becomes an isomorphism onto its image.
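The resulting isomorphisms may be sketched as follows, again assuming the subscripts $cp$, $scp$, $tcp$ for the support conditions and writing $\ker(P)$ and $\ker_{scp}(P)$ for the on-shell spaces of field histories without and with spatially compact support (our notation):

```latex
% the two steps of the argument
\[
  \ker_{scp}(P) = \mathrm{im}\big( G \vert_{\Gamma_{\Sigma,cp}(E)} \big),
  \qquad
  \ker\big( G \vert_{\Gamma_{\Sigma,cp}(E)} \big) = \mathrm{im}\big( P \vert_{\Gamma_{\Sigma,cp}(E)} \big)
\]
% hence the causal Green function descends to isomorphisms
\[
  \Gamma_{\Sigma,cp}(E) / \mathrm{im}(P)
  \;\underoverset{\simeq}{G}{\longrightarrow}\;
  \ker_{scp}(P),
  \qquad
  \Gamma_{\Sigma,tcp}(E) / \mathrm{im}(P)
  \;\underoverset{\simeq}{G}{\longrightarrow}\;
  \ker(P)
\]
```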
Under passing to dual vector spaces, the linear isomorphism in corollary 1 in turn yields linear isomorphisms of the form
Except possibly for the issue of continuity this says that the linear on-shell observables (def. 72) of a Green hyperbolic free field theory are equivalently those linear off-shell observables which are generalized solutions of the formally adjoint equation of motion according to def. 77.
That this remains true also for topological vector space structure follows with the dual exact sequence (86). This is the statement of prop. 43 below.
(distributional sections on a Green hyperbolic solution space are the generalized PDE solutions)
Let be a pair of Green hyperbolic differential operators (def. 79) which are formally adjoint (def. 58).
Then a continuous linear functional on the solution space
is equivalently a distributional section (def. 74) whose support is spacelike compact (def. 31, prop. 40)
and which is a distributional solution (def. 77) to the differential equation
Similarly, a continuous linear functional on the subspace of solutions that have spatially compact support (def. 31)
is equivalently a distributional section (def. 74) without constraint on its distributional support
and which is a distributional solution (def. 77) to the differential equation
Moreover, these linear isomorphisms are both given by composition with the causal Green function (def. 78):
This follows from the exact sequence in lemma 3. For details of the proof, which is due to Igor Khavkine, see the entry Green hyperbolic differential operator.
In conclusion we have found the following:
(linear observables of Green free field theory are the distributional solutions to the formally adjoint equations of motion)
Let be a free Lagrangian field theory (def. 62) whose Euler-Lagrange differential equation of motion (def. 61) is Green hyperbolic (def. 79), such as the Klein-Gordon equation (example 63) or the Dirac equation (example 64). Then:
The linear off-shell observables (def. 72) are equivalently the compactly supported distributional sections (def. 74) of the dual vector bundle (def. 8) of the field bundle:
The linear on-shell observables (def. 72) are equivalently those spacelike compactly supported distributional sections (def. 74) which are distributional solutions of the formally adjoint equations of motion (def. 58), and this isomorphism is exhibited by precomposition with the causal propagator :
Similarly the linear on-shell observables on spacelike compactly supported on-shell field histories (79) are equivalently the distributional solutions without constraint on their support:
The first statement follows with prop. 37 applied componentwise. The same proof applies verbatim to the subspace of solutions, showing that , with the dual topological vector space on the right. With this the second statement follows by prop. 43.
We will be interested in those linear observables which under the identification from theorem 1 correspond to the non-singular distributions (because on these the Poisson-Peierls bracket of the theory is defined, theorem 2 below):
(regular linear field observables and observable-valued distributions)
Let be a free Lagrangian field theory (def. 62) whose Euler-Lagrange equations of motion (prop. 81) are Green hyperbolic (def. 79).
Define the regular linear field observables among the linear on-shell observables (def. 72) to be the non-singular distributions on the on-shell space of field histories, hence the image
of the map
By lemma 3 every is in the image of and by example 65 this implies that the kernel of this map is the image of :
The point-evaluation field observables (example 60) are linear observables (example 61) but far from being regular (89) (except in spacetime dimension ). But the regular observables are precisely the averages (“smearings”) of these point evaluation observables against compactly supported weights.
Viewed this way, the defining inclusion of the regular linear observables (89) is itself an observable-valued distribution
which to a “smearing function” assigns the observable which is the field observable smeared by (i.e. averaged against) that smearing function.
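A sketch of this smearing, assuming we write $\alpha^\ast$ for a compactly supported section of the dual bundle $E^\ast$ (the smearing function), $\mathbf{\Phi}(\alpha^\ast)$ for the smeared field observable, $P^\ast$ for the formally adjoint equation of motion operator, and $\mathrm{LinObs}^{reg}$ for the regular linear observables (all of these symbols are assumptions of this sketch):

```latex
% the smeared field observable
\[
  \mathbf{\Phi}(\alpha^\ast) \;\colon\;
  \Phi \;\mapsto\;
  \int_\Sigma \alpha^\ast_a(x)\, \Phi^a(x)\, \mathrm{dvol}_\Sigma(x)
\]
% informally: an average of point evaluation observables
\[
  \mathbf{\Phi}(\alpha^\ast)
  \;=\;
  \int_\Sigma \alpha^\ast_a(x)\, \mathbf{\Phi}^a(x)\, \mathrm{dvol}_\Sigma(x)
\]
% regular linear observables as smearing functions modulo the image of P^*
\[
  \mathrm{LinObs}^{reg}
  \;\simeq\;
  \Gamma_{\Sigma,cp}(E^\ast) / \mathrm{im}(P^\ast)
\]
```

The quotient reflects that smearing functions of the form $P^\ast \beta^\ast$ vanish on all on-shell field histories.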
Below in Free quantum fields we discuss how the polynomial Poisson algebra of regular polynomial observables of a free field theory may be deformed to a non-commutative algebra of quantum observables. Often this may be represented by linear operators acting on some Hilbert space. In this case then above becomes a continuous linear functional from to a space of linear operators on some Hilbert space. As such it is then called an operator-valued distribution.
Local observables
We now discuss the sub-class of those observables which are “local”.
(spacetime support)
Let be a field bundle over a spacetime (def. 34), with induced jet bundle
For every subset let
be the corresponding restriction of the jet bundle of .
The spacetime support of a differential form on the jet bundle of is the topological closure of the maximal subset such that the restriction of to the jet bundle over any point of this subset does not vanish:
We write
for the subspace of differential forms on the jet bundle whose spacetime support is a compact subspace.
(transgression of variational differential forms to space of field histories)
Let be a field bundle over a spacetime (def. 34), and let
be a submanifold of spacetime of dimension . Recall the space of field histories restricted to its infinitesimal neighbourhood, denoted (def. 34).
Then the operation of transgression of variational differential forms to is the linear map
that sends a variational differential form to the differential form (def. 37, example 32) which to a smooth family of field histories
assigns the differential form given by first forming the pullback of differential forms along the family of jet prolongation followed by the integration of differential forms over :
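In formulas this may be sketched as follows, assuming we write $\Sigma_r$ for the submanifold, $N_\Sigma \Sigma_r$ for its infinitesimal neighbourhood, $\tau_{\Sigma_r}$ for the transgression map, $\Phi_{(-)} \colon U \to \Gamma_{\Sigma_r}(E)$ for the smooth family and $j^\infty_\Sigma$ for jet prolongation (notational assumptions of this sketch):

```latex
\[
  \tau_{\Sigma_r} \;\colon\;
  \Omega^{\bullet,\bullet}_{\Sigma,cp}(E)
  \longrightarrow
  \Omega^\bullet\big( \Gamma_{\Sigma_r}(E) \big)
\]
% pull back along the family of jet prolongations, then integrate over \Sigma_r
\[
  \tau_{\Sigma_r}(\alpha)\big( \Phi_{(-)} \big)
  \;:=\;
  \int_{\Sigma_r} \big( j^\infty_\Sigma( \Phi_{(-)} ) \big)^\ast \alpha
  \;\;\in\; \Omega^\bullet(U)
\]
```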
(transgression to dimension picks out horizontal -forms)
In def. 82 we regard integration of differential forms over as an operation defined on differential forms of all degrees, which vanishes except on forms of degree , and hence transgression of variational differential forms to vanishes except on the subspace
of forms of horizontal degree .
(adiabatically switched action functional)
Given a field bundle , consider a local Lagrangian density (def. 60)
For any bump function , the transgression of (def. 82) is called the action functional
induced by , “adiabatically switched” by .
Specifically if the field bundle is a trivial vector bundle as in example 9, such that the Lagrangian density may be written in the form
then its action functional takes a field history to the value
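A sketch of these formulas, assuming we write $b$ for the bump function, $L$ for the coefficient function of the Lagrangian density $\mathbf{L}$ with respect to the volume form, and $S_b$ for the adiabatically switched action functional (our notation):

```latex
% Lagrangian density in coordinates of the trivial vector bundle
\[
  \mathbf{L}
  \;=\;
  L\big( (x^\mu), (\phi^a), (\phi^a_{,\mu}), \cdots \big)\, \mathrm{dvol}_\Sigma
\]
% adiabatically switched action functional as a transgression
\[
  S_b := \tau_\Sigma( b\, \mathbf{L} ),
  \qquad
  S_b(\Phi)
  \;=\;
  \int_\Sigma
    b(x)\,
    L\Big( x, \Phi(x), \frac{\partial \Phi}{\partial x^\mu}(x), \cdots \Big)\,
  \mathrm{dvol}_\Sigma(x)
\]
```

The bump function ensures that the integral is finite even though spacetime is not compact.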
(transgression compatible with variational derivative)
Let be a field bundle over a spacetime (def. 34) and let be a submanifold possibly with boundary . Write
for the boundary restriction map.
Then the operation of transgression of variational differential forms (def. 82)
is compatible with the variational derivative and with the total spacetime derivative in the following way:
On variational forms that are in the image of the total spacetime derivative a transgressive variant of the Stokes' theorem (prop. 4) holds:
Transgression intertwines, up to a sign, the variational derivative on variational differential forms with the plain de Rham differential on the space of field histories:
Regarding the first statement, consider a horizontally exact variational form
By prop. 20 the pullback of this form along the jet prolongation of fields is exact in the -direction:
(where we write for the de Rham differential on ). Hence by the ordinary Stokes' theorem (prop. 4) restricted to any with restriction we obtain the relation
Regarding the second statement: by the Leibniz rule for de Rham differential (product law of differentiation) it is sufficient to check the claim on variational derivatives of local coordinate functions
The pullback of differential forms (prop. 2) along the jet prolongation has two contributions: one from the variation along , the other from variation along :
By prop. 20, for fixed the pullback of along the jet prolongation vanishes.
For fixed , the pullback of the full de Rham differential is
(since the full de Rham differentials always commute with pullback of differential forms by prop. 2), while the pullback of the horizontal derivative vanishes at fixed .
This implies over the given smooth family that
and since this holds covariantly for all smooth families , this implies the claim.
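The two statements of this proposition may be sketched as follows. We assume the notation $\tau_{\Sigma_r}$ for transgression, $d$ for the total spacetime derivative and $\delta$ for the variational derivative on the jet bundle, and $d$ also for the de Rham differential on the space of field histories; since the text only asserts the second relation up to a sign, we leave that sign unspecified as $\pm$:

```latex
% transgressive Stokes theorem, with boundary restriction (-)|_{\partial \Sigma_r}
\[
  \tau_{\Sigma_r}( d \alpha )
  \;=\;
  \big( (-)\vert_{\partial \Sigma_r} \big)^\ast \,
  \tau_{\partial \Sigma_r}( \alpha )
\]
% variational derivative versus de Rham differential on field histories
\[
  \tau_{\Sigma_r}( \delta \alpha )
  \;=\;
  \pm \, d \, \tau_{\Sigma_r}( \alpha )
\]
```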
(variation of the action functional)
Given a Lagrangian field theory (def. 60) then the derivative of its adiabatically switched action functional (def. 66) equals the transgression of the Euler-Lagrange variational derivative (def. 22):
By the second statement of prop. 44 we have
Moreover, by prop. 22 this is
where the second term vanishes by the first statement of prop. 44.
(principle of extremal action)
Let be a Lagrangian field theory (def. 60).
The de Rham differential of the action functional (example 67) vanishes at a field history
for all adiabatic switchings constant on some subset (def. 33) on those smooth collections of field histories
around which, as functions on , are constant outside (example 16, example 33) precisely if solves the Euler-Lagrange equations of motion (def. 61):
By prop. 44 we have
By the assumption on it follows that after pullback to the switching function is constant, so that it commutes with the differentials:
This vanishes at for all precisely if all components of vanish, which is the statement of the Euler-Lagrange equations of motion.
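As a worked sketch of this principle, consider the free real scalar field. We assume here the Minkowski metric $\eta = \mathrm{diag}(-1, +1, \cdots, +1)$, the Lagrangian function $L$ below with mass parameter $m$, and a variation $\chi$ with compact support inside the region where the switching function $b$ equals 1; these conventions are assumptions of the sketch and may differ by signs from those used elsewhere:

```latex
\[
  L
  \;=\;
  - \tfrac{1}{2} \eta^{\mu \nu}\, \partial_\mu \Phi \, \partial_\nu \Phi
  - \tfrac{1}{2} m^2 \Phi^2
\]
\begin{align*}
  \frac{d}{d \epsilon} S_b( \Phi + \epsilon \chi ) \Big\vert_{\epsilon = 0}
  & =
  \int_\Sigma
    \big(
      - \eta^{\mu \nu}\, \partial_\mu \Phi \, \partial_\nu \chi
      - m^2 \Phi \, \chi
    \big)\,
  \mathrm{dvol}_\Sigma
  \\
  & =
  \int_\Sigma
    \big(
      \eta^{\mu \nu}\, \partial_\mu \partial_\nu \Phi
      - m^2 \Phi
    \big)\, \chi \;
  \mathrm{dvol}_\Sigma
\end{align*}
% vanishing for all such \chi is the Klein-Gordon equation
\[
  \eta^{\mu \nu}\, \partial_\mu \partial_\nu \Phi - m^2 \Phi \;=\; 0
\]
```

The second step is integration by parts, where the boundary term vanishes by the compact support of $\chi$.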
Given a Lagrangian field theory (def. 60) with on-shell space of histories (62) then the space
of observables is simply the space of complex-valued smooth functions
on the on-shell space of field histories (62). This is a star-algebra under pointwise complex conjugation.
(That we consider functions with values in complex numbers instead of real numbers is a reflection of the superposition principle in quantum physics, more about this below.)
On the other hand the local observables are the horizontal (p+1)-forms
of compact spacetime support (def. 81)
modulo total spacetime derivatives
which we may identify with the subspace of all observables (92) consisting of those that arise as the image under transgression of variational differential forms (def. 82) of local observables to functionals on the on-shell space of field histories (62):
This is a sub-vector space inside all observables which is in general not closed under the product of functions. We write
for the smallest subalgebra of observables, under the pointwise product, that contains all the local observables. This is called the algebra of multilocal observables.
(local observables of the real scalar field)
Consider the field bundle of the real scalar field (example 10).
A typical example of local observables (def. 83) in this case is the “field amplitude averaged over a given spacetime region” determined by a bump function . On an on-shell field history this observable takes as value the integral
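This integral may be sketched as follows, assuming we write $b$ for the bump function, $\phi$ for the field coordinate on the jet bundle and $A_b$ for the resulting observable (our notation):

```latex
% the local observable as the transgression of b \phi dvol
\[
  A_b := \tau_\Sigma\big( b \, \phi \, \mathrm{dvol}_\Sigma \big),
  \qquad
  A_b(\Phi)
  \;=\;
  \int_\Sigma b(x)\, \Phi(x)\, \mathrm{dvol}_\Sigma(x)
\]
```

This is the smearing of the point evaluation observables of the scalar field against the compactly supported weight $b$.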
(local observables of the electromagnetic field)
Consider the field bundle for free electromagnetism on Minkowski spacetime .
Then for a bump function on spacetime, the transgression of the universal Faraday tensor (def. 36) against times the volume form is a local observable (def. 83), namely the field strength (20) of the electromagnetic field averaged over spacetime.
Infinitesimal observables
The definition of observables in def. 71 and specifically of local observables in def. 83 uses explicit restriction to the shell, hence, by the principle of extremal action (prop. 45) to the “critical locus” of the action functional. Such critical loci are often hard to handle explicitly. It helps to consider a “homological resolution” that is given, in good circumstances, by the corresponding “derived critical locus”. These we consider in detail below in Reduced phase space. In order to have good control over these resolutions, we here consider the first perturbative aspect of field theory, namely we consider the restriction of local observables to just an infinitesimal neighbourhood of a background on-shell field history:
(local observables around infinitesimal neighbourhood of background on-shell field history)
Let be a Lagrangian field theory (def. 60) whose field bundle is a trivial vector bundle (example 9) and whose Lagrangian density is spacetime-independent (example 24). Let be a constant section of the shell (56) as in example 24.
Then we write
for the restriction of the local observables (def. 83) to the fiberwise infinitesimal neighbourhood (example 27) of .
Explicitly, this means the following:
First of all, by prop. 19 the dependence of the Lagrangian density on the order of field derivatives is bounded by some on some neighbourhood of and hence, by the spacetime independence of , on some neighbourhood of .
Therefore we may restrict without loss to the order- jets. By slight abuse of notation we still write
for the corresponding shell. It follows then that the restriction of the ring of smooth functions on the jet bundle to the infinitesimal neighbourhood (example 27) is equivalently the formal power series ring over in the variables
We denote this by
A key consequence is that the further restriction of this ring to the shell (49) is now simply the further quotient ring by the ideal generated by the total spacetime derivatives of the components of the Euler-Lagrange form (prop. 22).
Finally the local observables restricted to the infinitesimal neighbourhood form the module
The space of local observables in def. 84 is the quotient of a formal power series algebra by the components of the Euler-Lagrange form and by the image of the horizontal spacetime de Rham differential. It is convenient to also conceive of the components of the Euler-Lagrange form as the image of a differential, for then the algebra of local observables obtains a cohomological interpretation, which will lend itself to computation. This differential, whose image is the components of the Euler-Lagrange form, is called the BV-differential. We introduce this now first (def. 85 below) in a direct ad-hoc way. Further below we discuss the conceptual nature of this differential as part of the construction of the reduced phase space as a derived critical locus (example 101 below).
(local BV-complex of ordinary Lagrangian density)
Let be a Lagrangian field theory (def. 60) whose field bundle is a trivial vector bundle (example 9) and whose Lagrangian density is spacetime-independent (example 24). Let be a constant section of the shell (56).
In correspondence with def. 84, write
for the restriction of vertical vector fields on the jet bundle to the fiberwise infinitesimal neighbourhood (example 27) of .
Now we regard this as a graded module over (93) concentrated in degree -1:
This is called the module of antifields corresponding to the given type of fields encoded by .
If the field bundle is a trivial vector bundle (example 9) with field coordinates , then we write
for the vector field generator that takes derivatives along , but regarded now in degree -1.
Evaluation of vector fields in the total spacetime derivatives of the variational derivative (prop. 22) yields a linear map over (94)
If we use the volume form on spacetime to induce an identification
with respect to which the Lagrangian density decomposes as
then this is a -linear map of the form
In the special case that the field bundle is a trivial vector bundle (example 9) with field coordinates so that the Euler-Lagrange variational derivative has the coordinate expansion
then this map is given on the antifield basis elements (96) by
Consider then the graded symmetric algebra
which is generated over from the module of vector fields in degree -1.
If we think of a single vector field as a fiber-wise linear function on the cotangent bundle, and of a multivector field similarly as a multilinear function on the cotangent bundle, then we may think of this as the algebra of functions on the infinitesimal neighbourhood (example 27) of inside the graded manifold .
Let now
be the unique extension of the linear map to an -linear derivation of degree +1 on this algebra.
The resulting differential graded-commutative algebra over
is called the local BV-complex of the Lagrangian field theory at the background solution . This is the CE-algebra of the infinitesimal neighbourhood of in the derived prolonged shell (def. 120). In this case, in the absence of any explicit infinitesimal gauge symmetries, this is an example of a Koszul complex.
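A finite-dimensional toy model may help to see why this is a Koszul complex. The following sketch is an illustration under simplifying assumptions, not the field-theoretic definition itself: the jet bundle is replaced by a Cartesian space with coordinates $x^i$, the Lagrangian density by a single smooth function $S$, and the antifields by generators $x^\ddagger_i$ in degree -1. All of these symbols are notation chosen for this sketch.

```latex
% Toy Koszul complex of a function S on R^n (notation is an assumption of this sketch)
\begin{aligned}
  s(x^i) &= 0\,, \qquad
  s(x^\ddagger_i) = \frac{\partial S}{\partial x^i}\,, \\
  s\big(x^\ddagger_i \, x^\ddagger_j\big)
    &= \frac{\partial S}{\partial x^i}\, x^\ddagger_j
     - x^\ddagger_i \,\frac{\partial S}{\partial x^j}
     \qquad \text{(graded derivation of degree } +1\text{)}\,, \\
  s^2\big(x^\ddagger_i \, x^\ddagger_j\big)
    &= \frac{\partial S}{\partial x^i}\frac{\partial S}{\partial x^j}
     - \frac{\partial S}{\partial x^i}\frac{\partial S}{\partial x^j}
     = 0\,, \\
  H^0(s) &= C^\infty(\mathbb{R}^n)\big/\big(\partial_1 S, \ldots, \partial_n S\big)\,.
\end{aligned}
```

In this toy model the cohomology in degree 0 is the ring of functions modulo the ideal generated by the "equations of motion", which is the finite-dimensional analog of restricting to the shell.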
There are canonical homomorphisms of dgc-algebras, one from the algebra of functions on the infinitesimal neighbourhood of the background solution to the local BV-complex and from there to the local observables on the neighbourhood of the background solution (94), all considered with compact spacetime support:
such that the composite is the canonical quotient coprojection.
Similarly we obtain a factorization for the entire variational bicomplex:
where is now triply graded, with three anti-commuting differentials and .
By construction this is now such that the local observables (def. 83) are the cochain cohomology of this complex in horizontal form degree p+1, vertical degree 0 and BV-degree 0:
States
(states)
Given a Lagrangian field theory , then a (classical) state is a function from the space of observables to the complex numbers
such that
(linearity) this is a linear map;
(positivity) for any we have that
Below we consider quantum states. These are defined formally in just the same way, only that now the algebra of observables is equipped with another product, which changes the meaning of the product expression .
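The two conditions can be made concrete in a finite toy model. The sketch below assumes, purely for illustration, that there are only three "field histories", that observables are complex-valued functions on this finite set with pointwise product and complex conjugation as star-involution, and that the state is the expectation value with respect to a probability weight. The normalization check is a common extra requirement and is likewise an assumption here; the names `make_state`, `pointwise_product` and `conjugate` are hypothetical.

```python
from typing import Callable, Sequence

def make_state(weights: Sequence[float]) -> Callable[[Sequence[complex]], complex]:
    """Return the linear functional A -> sum_i w_i * A_i for probability weights w."""
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-12:
        raise ValueError("weights must be non-negative and sum to 1")

    def state(observable: Sequence[complex]) -> complex:
        # Expectation value of the observable in the given probability weight.
        return sum(w * a for w, a in zip(weights, observable))

    return state

def pointwise_product(a, b):
    """Commutative pointwise product of two observables."""
    return [x * y for x, y in zip(a, b)]

def conjugate(a):
    """Star-involution: pointwise complex conjugation."""
    return [complex(x).conjugate() for x in a]

# Toy data (illustrative values only).
state = make_state([0.5, 0.25, 0.25])
A = [1 + 2j, -3j, 0.5]
B = [2.0, 1 - 1j, 4j]

# Linearity: <A + 2B> equals <A> + 2<B>.
lhs = state([a + 2 * b for a, b in zip(A, B)])
rhs = state(A) + 2 * state(B)

# Positivity: <A^* A> is real and non-negative.
norm_sq = state(pointwise_product(conjugate(A), A))
```

For quantum states the same two checks apply, with `pointwise_product` replaced by the non-commutative product of the quantum algebra.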
This concludes our discussion of observables. In the next chapter we consider the construction of the covariant phase space and of the Poisson-Peierls bracket on observables.
It might seem that with the construction of the local observables (def. 83) on the on-shell space of field histories (prop. 22) the field theory defined by a Lagrangian density (def. 60) has been completely analyzed: This data specifies, in principle, which field histories are realized, and which observable properties these have.
In particular, if the Euler-Lagrange equations of motion (def. 61) admit Cauchy surfaces (def. 87 below), i.e. spatial codimension 1 slices of spacetimes such that a field history is uniquely specified already by its restriction to the infinitesimal neighbourhood of that spatial slice, then a sufficiently complete collection of local observables whose spacetime support (def. 81) covers that Cauchy surface allows one to predict the evolution of the field histories through time from that Cauchy surface.
This is all that one might think a theory of physical fields should accomplish, and in fact this is essentially all that was thought to be required of a theory of nature from about Isaac Newton’s time to about Max Planck’s time.
But we have seen that a remarkable aspect of Lagrangian field theory is that the de Rham differential of the local Lagrangian density (def. 60) decomposes into two kinds of variational differential forms (prop. 22), one of which is the Euler-Lagrange form which determines the equations of motion (47).
However, there is a second contribution: The presymplectic current (52). Since this is of horizontal degree , its transgression (def. 82) implies a further structure on the space of field histories restricted to spacetime submanifolds of dimension (i.e. of spacetime “codimension 1”). There may be such submanifolds such that this restriction to their infinitesimal neighbourhood (example 27) does not actually change the on-shell space of field histories; these are called the Cauchy surfaces (def. 87 below).
By the Hamiltonian Noether theorem (prop. 35) the presymplectic current induces infinitesimal symmetries acting on field histories and local observables, given by the local Poisson bracket (prop. 36). The transgression (def. 82) of the presymplectic current to these Cauchy surfaces yields the corresponding infinitesimal symmetry group acting on the on-shell field histories, whose Lie bracket is the Poisson bracket pairing on on-shell observables (example 70 below). This data, the on-shell space of field histories on the infinitesimal neighbourhood of a Cauchy surface equipped with the infinitesimal symmetry exhibited by the Poisson bracket, is called the phase space of the theory (def. 88 below).
In fact if enough Cauchy surfaces exist, then the presymplectic forms associated with any one choice turn out to agree after pullback to the full on-shell space of field histories, exhibiting this as the covariant phase space of the theory (prop. 46 below) which is hence manifestly independent of a choice of space/time splitting. Accordingly, also the Poisson bracket on on-shell observables exists in a covariant form; for free field theories with Green hyperbolic equations of motion (def. 79) this is called the Peierls-Poisson bracket (theorem 2 below). The integral kernel for this Peierls-Poisson bracket is called the causal propagator (prop. 42). Its “normal ordered” or “positive frequency component”, called the Hadamard propagator (def. 107 below) as well as the corresponding time-ordered variant, called the Feynman propagator (def. 108 below), which we discuss in detail in Propagators below, control the causal perturbation theory for constructing perturbative quantum field theory by deforming the commutative pointwise product of on-shell observables to a non-commutative product governed to first order by the Peierls-Poisson bracket.
To see how such a deformation quantization comes about conceptually from the phase space structure, notice from the basic principles of homotopy theory that given any structure on a space which is invariant with respect to a symmetry group acting on the space (here: the presymplectic current) then the true structure at hand is the homotopy quotient of that space by that symmetry group. We will explain this further below. Here we just point out that the homotopy quotient of the phase space by the infinitesimal symmetries of the presymplectic current is called the symplectic groupoid and that the true algebra of observables is hence the (polarized) convolution algebra of functions on this groupoid. This turns out to be the “algebra of quantum observables” and the passage from the naive local observables on presymplectic phase space to this non-commutative algebra of functions on its homotopy quotient to the symplectic groupoid is called quantization. This we discuss in much detail below; for the moment this is just to motivate why the covariant phase space is the crucial construction to be extracted from a Lagrangian field theory.
We now discuss these topics:
Covariant phase space
Given a Lagrangian field theory on a spacetime (def. 60), then a Cauchy surface is a submanifold (def. 44) such that the restriction map from the on-shell space of field histories (62) to the space (63) of on-shell field histories restricted to the infinitesimal neighbourhood of (example 27) is an isomorphism:
(phase space associated with a Cauchy surface)
Given a Lagrangian field theory on a spacetime (def. 60) and given a Cauchy surface (def. 87) then the corresponding phase space is
the super smooth set (63) of on-shell field histories restricted to the infinitesimal neighbourhood of ;
equipped with the differential 2-form (as in def. 37)
which is the distributional transgression (def. 82) of the presymplectic current (def. 22) to .
This is a closed differential form in the sense of def. 37, due to prop. 44 and using that is closed by definition (52). As such this is called the presymplectic form on the phase space.
(evaluation of transgressed variational form on tangent vectors for free field theory)
Let be a Lagrangian field theory (def. 60) which is free (def. 62) hence whose field bundle is some smooth super vector bundle (example 9) and whose Euler-Lagrange equation of motion is linear. Then the synthetic tangent bundle (def. 26) of the on-shell space of field histories (62) with spacelike compact support (def. 31) is canonically identified with the Cartesian product of this super smooth set with itself
With field coordinates as in example 9, we may expand the presymplectic current as
where the components are smooth functions on the jet bundle.
Under these identifications the value of the presymplectic form (100) on two tangent vectors at a point is
(presymplectic form for free real scalar field)
Consider the Lagrangian field theory for the free real scalar field from example 39.
Under the identification of example 70 the presymplectic form on the phase space (def. 88) associated with a Cauchy surface is given by
Here the first equation follows via example 70 from the form of from example 45, while the second equation identifies the integrand as the witness for the formal self-adjointness of the Klein-Gordon equation from example 51.
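As a numerical sanity check of a pairing of this kind, the following sketch evaluates the integral of "first field times time derivative of second field, minus the same with the two fields exchanged" for two solutions of the Klein-Gordon equation, and confirms that the value does not depend on the time slice, in line with prop. 46 below. Several things here are assumptions of the sketch and not taken from the text: spacetime is 1+1-dimensional, space is a circle of circumference 2π rather than the real line (so that the integral is an exact finite sum for these trigonometric modes), the overall sign and normalization of the pairing are a choice, and the names `mode_sum`, `symplectic_pairing`, `M` and `N` are hypothetical.

```python
import math

M = 1.0   # mass parameter (illustrative value)
N = 256   # number of grid points on the circle [0, 2*pi)

def omega(k: int) -> float:
    """Frequency fixed by the dispersion relation omega^2 = k^2 + M^2."""
    return math.sqrt(k * k + M * M)

def mode_sum(coeffs):
    """Real solution sum_k a_k cos(k x - w_k t) + b_k sin(k x - w_k t).

    coeffs maps an integer wave number k to the pair (a_k, b_k).
    Returns the pair of functions (phi, d phi / d t).
    """
    def phi(t, x):
        return sum(a * math.cos(k * x - omega(k) * t)
                   + b * math.sin(k * x - omega(k) * t)
                   for k, (a, b) in coeffs.items())

    def dphi_dt(t, x):
        return sum(omega(k) * (a * math.sin(k * x - omega(k) * t)
                               - b * math.cos(k * x - omega(k) * t))
                   for k, (a, b) in coeffs.items())

    return phi, dphi_dt

def symplectic_pairing(sol1, sol2, t):
    """Integral over the time-t slice of phi1 * dt(phi2) - phi2 * dt(phi1)."""
    phi1, dphi1 = sol1
    phi2, dphi2 = sol2
    dx = 2 * math.pi / N
    return sum((phi1(t, i * dx) * dphi2(t, i * dx)
                - phi2(t, i * dx) * dphi1(t, i * dx)) * dx
               for i in range(N))

sol1 = mode_sum({1: (1.0, 0.0), 3: (0.5, -0.2)})
sol2 = mode_sum({1: (0.3, 0.7), 2: (1.0, 0.0), 3: (0.0, 1.0)})

# The same pairing evaluated on three different time slices.
values = [symplectic_pairing(sol1, sol2, t) for t in (0.0, 0.7, 2.5)]
```

The three entries of `values` agree up to rounding, which is the discrete shadow of the statement that the transgressed presymplectic form is independent of the Cauchy surface.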
(presymplectic form for free Dirac field)
Consider the Lagrangian field theory of the free Dirac field (example 43).
Under the identification of example 70 the presymplectic form on the phase space (def. 88) associated with a Cauchy surface is given by
Here the first equation follows via example 70 from the form of from example 49, while the second equation identifies the integrand as the witness for the formal self-adjointness of the Dirac equation from example 53.
Consider a Lagrangian field theory on a spacetime (def. 60).
Let
be a submanifold with two boundary components , both of which are Cauchy surfaces (def. 87).
Then the corresponding inclusion diagram
induces a Lagrangian correspondence between the associated phase spaces (def. 88)
in that the pullback of the two presymplectic forms (100) coincides on the space of field histories:
Hence there is a well defined presymplectic form
on the genuine space of field histories, given by for any Cauchy surface . This presymplectic smooth space
is therefore called the covariant phase space of the Lagrangian field theory .
By prop. 23 the total spacetime derivative of the presymplectic current vanishes on-shell:
in that the pullback (def. 10) along the shell inclusion (48) vanishes:
This implies that the transgression of to the on-shell space of field histories vanishes (since by definition (61) that involves pulling back through the shell inclusion)
But then the claim follows with prop. 44:
(polynomial Poisson bracket on covariant phase space – the Peierls bracket)
Let be a Lagrangian field theory (def. 60) such that
it is a free field theory (def. 62)
whose Euler-Lagrange equation of motion (def. 61) is
formally self-adjoint or formally anti self-adjoint (def. 58) such that
Green hyperbolic (def. 79).
Write
for the linear map from regular linear field observables (def. 80) to on-shell field histories with spatially compact support (def. 31) given under the identification (90) by the causal Green function (def. 78).
Then for every Cauchy surface (def. 87) this map is an inverse to the presymplectic form (def. 88) in that, under the identification of tangent vectors to field histories from example 70, we have that the composite
equals the evaluation map of observables on field histories.
This means that for every Cauchy surface the presymplectic form restricts to a symplectic form on regular linear observables. The corresponding Poisson bracket is
Moreover, equation (101) implies that this is the covariant Poisson bracket in the sense of the covariant phase space (prop. 46) in that it does not actually depend on the choice of Cauchy surface.
An equivalent expression for the Poisson bracket that makes its independence from the choice of Cauchy surface manifest is the -Peierls bracket given by
where on the left
Hence under the given assumptions, for every Cauchy surface the Poisson bracket associated with that Cauchy surface equals the invariantly (“covariantly”) defined Peierls bracket
Finally this means that in terms of the causal propagator (84) the covariant Peierls-Poisson bracket is given in generalized function-notation by
Therefore, while the point-evaluation field observables (example 60) are not themselves regular observables (def. 80), the Peierls-Poisson bracket (103) is induced from the following distributional bracket between them
with the causal propagator (84) on the right, in that with the identification (91) the Peierls-Poisson bracket on regular linear observables arises as follows:
Consider two more Cauchy surfaces , in the future and in the past of , respectively. Choose a partition of unity on consisting of two elements with support bounded by these Cauchy surfaces: .
Then define
by
Notice that the support of the partitioned field history is in the compactly sourced future/past cone
since is supported in the compactly sourced causal cone, but that indeed has compact support as required by (104): Since , by assumption, the support is the intersection of that of with that of , and the first is spacelike compact by assumption, while the latter is timelike compact, by definition of partition of unity.
Similarly, the equality in (105) holds because by partition of unity .
It follows that
where in the second line we chose from the two equivalent expressions (105) such that via (106) the defining property of the advanced or retarded Green function, respectively, may be applied, as shown under the braces.
Now we apply this to the computation of :
Here we computed as follows:
applied the assumption that ;
applied the above partition of unity;
used the Stokes theorem (prop. 4) for the past and the future of , respectively;
applied the definition of as the witness of the formal (anti-) self-adjointness of (def. 58);
unified the two integration domains, now that the integrands are the same;
used the formal (anti-)self-adjointness of the Green functions (example 65);
used (107).
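The statement of theorem 2 can be checked by hand in the simplest possible case. The sketch below treats the harmonic oscillator as a field theory on a 0+1-dimensional spacetime, where every instant of time is a Cauchy surface, and compares the canonical Poisson bracket of the position observable at two times with the difference of advanced and retarded Green functions. This toy model, the unit mass, the frequency value `W`, the physics naming convention (retarded means supported at later times), the overall sign of the causal propagator, and all function names are assumptions of this sketch, not taken from the text.

```python
import math

W = 1.3  # oscillator frequency (illustrative value)

def q(t, q0, p0):
    """Solution of q'' + W^2 q = 0 with initial data (q0, p0) at t = 0, unit mass."""
    return q0 * math.cos(W * t) + (p0 / W) * math.sin(W * t)

def canonical_bracket(t1, t2, h=1e-3):
    """{q(t1), q(t2)} with respect to the initial data, via central differences.

    Since q(t) is linear in (q0, p0) the central differences are exact
    up to rounding.
    """
    def d(t, which):
        if which == "q0":
            return (q(t, h, 0.0) - q(t, -h, 0.0)) / (2 * h)
        return (q(t, 0.0, h) - q(t, 0.0, -h)) / (2 * h)

    return d(t1, "q0") * d(t2, "p0") - d(t1, "p0") * d(t2, "q0")

def g_retarded(t1, t2):
    """Green function of d^2/dt^2 + W^2 supported where t1 >= t2."""
    return math.sin(W * (t1 - t2)) / W if t1 >= t2 else 0.0

def g_advanced(t1, t2):
    """Green function of d^2/dt^2 + W^2 supported where t1 <= t2."""
    return -math.sin(W * (t1 - t2)) / W if t1 <= t2 else 0.0

def causal_propagator(t1, t2):
    """Difference of the two Green functions (sign convention chosen for this sketch)."""
    return g_advanced(t1, t2) - g_retarded(t1, t2)
```

With these conventions `canonical_bracket(t1, t2)` and `causal_propagator(t1, t2)` agree for all times, and both equal sin(W (t2 - t1)) / W; in particular the bracket does not refer to the instant at which the initial data was specified.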
(scalar field and Dirac field have covariant Peierls-Poisson bracket)
Examples of free Lagrangian field theories for which the assumptions of theorem 2 are satisfied, so that the covariant Poisson bracket exists in the form of the Peierls bracket include
the free real scalar field (example 39);
the free Dirac field (example 43).
For the free scalar field this is the statement of example 63 with example 71, while for the Dirac field this is the statement of example 64 with example 72.
For the free electromagnetic field (example 40) the assumptions of theorem 2 are violated: the covariant phase space does not exist. But in the discussion of Gauge fixing, below, we will find that for an equivalent re-incarnation of the electromagnetic field, they are met after all.
BV-resolution of the covariant phase space
So far we have discussed the covariant phase space (prop. 46) in terms of explicit restriction to the shell. We now turn to the more flexible perspective where a homological resolution of the shell in terms of “antifields” is used (def. 85).
(BV-presymplectic current)
Let be a Lagrangian field theory (def. 60) whose field bundle is a trivial vector bundle (example 9) and whose Lagrangian density is spacetime-independent (example 24). Let be a constant section of the shell (56).
Then in the BV-variational bicomplex (98) there exists the BV-presymplectic potential
and the corresponding BV-presymplectic current
defined by
where are the given field coordinates, the corresponding antifield coordinates (96) and the corresponding components of the Euler-Lagrange form (prop. 22).
(local BV-BFV relation)
Let be a Lagrangian field theory (def. 60) whose field bundle is a trivial vector bundle (example 9) and whose Lagrangian density is spacetime-independent (example 24). Let be a constant section of the shell (56).
Then the BV-presymplectic current (def. 74) witnesses the on-shell vanishing (prop. 23) of the total spacetime derivative of the genuine presymplectic current (prop. 22) in that the total spacetime derivative of equals the BV-differential of :
Hence if is a submanifold of spacetime of full dimension with boundary
then the pullback of the two presymplectic forms (100) on the incoming and outgoing spaces of field histories, respectively, differ by the BV-differential of the transgression of the BV-presymplectic current:
This homological resolution of the Lagrangian correspondence that exhibits the “covariance” of the covariant phase space (prop. 46) is known as the BV-BFV relation (Cattaneo-Mnev-Reshetikhin 12 (9)).
For the first statement we compute as follows:
where the first steps simply unwind the definitions, and where the last step is prop. 23.
With this the second statement follows by immediate generalization of the proof of prop. 46.
(derived presymplectic current of real scalar field)
Consider a Lagrangian field theory (def. 60) without any non-trivial implicit infinitesimal gauge transformations (def. \ref{ImplicitInfinitesimalGaugeSymmetry}); for instance the real scalar field from example 39.
Inside its local BV-complex (def. 85) we may form the linear combination of
the presymplectic current (example 45)
the BV-presymplectic current (example 74).
This yields a vertical 2-form
which might be called the derived presymplectic current.
Similarly we may form the linear combination of
the presymplectic potential current (46)
the BV-presymplectic potential current (108)
the Lagrangian density (def. 60)
hence
(where the sum of the two terms on the right is the Lepage form (53)). This might be called the derived presymplectic potential current.
We then have that
and in fact
Of course the first statement follows from the second, but in fact the two contributions of the first statement even vanish separately:
The statement on the left is immediate from the definitions, since . For the statement on the right we compute
Here the first term vanishes via the local BV-BFV relation (prop. 47) while the other two terms vanish simply by degree reasons.
Similarly for the second statement we compute as follows:
Here the direct vanishing of various terms is again by simple degree reasons, and otherwise we used the definition of and, crucially, the variational identity (46).
Hamiltonian local observables
We have defined the local observables (def. 83) as the transgressions of horizontal -forms (with compact spacetime support) to the on-shell space of field histories over all of spacetime . More explicitly, these could be called the spacetime local observables.
But with every choice of Cauchy surface (def. 87) comes another notion of local observables: those that are transgressions of horizontal -forms (instead of -forms) to the on-shell space of field histories restricted to the infinitesimal neighbourhood of that Cauchy surface (def. 42): . These are spatially local observables, with respect to the given choice of Cauchy surface.
Among these spatially local observables are the Hamiltonian local observables (def. 89 below) which are transgressions specifically of the Hamiltonian forms (def. 70). These inherit a transgression of the local Poisson bracket (prop. 36) to a Poisson bracket on Hamiltonian local observables (def. 90 below). This is known as the Peierls bracket (example 76 below).
(Hamiltonian local observables)
Let be a Lagrangian field theory (def. 60).
Consider a local observable (def. 83)
hence the transgression of a variational horizontal -form of compact spacetime support.
Given a Cauchy surface (def. 87) we say that is Hamiltonian if it is also the transgression of a Hamiltonian differential form (def. 70), hence if there exists
whose transgression over the Cauchy surface equals the transgression of over all of spacetime , under the isomorphism (99)
Beware that the local observable defined by a Hamiltonian differential form as in def. 89 does in general depend not just on the choice of , but also on the choice of the Cauchy surface. The exceptions are those Hamiltonian forms which are conserved currents:
(conserved charges – transgression of conserved currents)
Let be a Lagrangian field theory (def. 60).
If a Hamiltonian differential form (def. 70) happens to be a conserved current (def. 66) in that its total spacetime derivative vanishes on-shell
then the induced Hamiltonian local observable (def. 89) is independent of the choice of Cauchy surface (def. 87) in that for any two Cauchy surfaces which are cobordant we have
The resulting constant is called the conserved charge of the conserved current, traditionally denoted
By definition the transgression of vanishes on the on-shell space of field histories. Therefore the result is given by Stokes' theorem (prop. 4).
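Spelled out, the argument runs as in the following sketch. The notation is an assumption made for this illustration: $\Sigma_1, \Sigma_2$ are the two cobordant Cauchy surfaces, $M$ is a cobordism between them with $\partial M = \Sigma_2 \sqcup \overline{\Sigma_1}$ (orientation conventions may differ from those of the text), $J$ is the conserved current, $d$ the total spacetime derivative, $j^\infty(\Phi)$ the jet prolongation of an on-shell field history $\Phi$, and $\tau_\Sigma$ the transgression to $\Sigma$.

```latex
% Conserved charge is independent of the Cauchy surface (sketch)
\begin{aligned}
  \tau_{\Sigma_2}(J)(\Phi) - \tau_{\Sigma_1}(J)(\Phi)
    &= \int_{\Sigma_2} j^\infty(\Phi)^\ast J
     - \int_{\Sigma_1} j^\infty(\Phi)^\ast J \\
    &= \int_{\partial M} j^\infty(\Phi)^\ast J \\
    &= \int_{M} d_{\mathrm{dR}}\big( j^\infty(\Phi)^\ast J \big)
       \qquad \text{(Stokes' theorem)} \\
    &= \int_{M} j^\infty(\Phi)^\ast ( d J )
     \;=\; 0
       \qquad \text{(since } dJ \text{ vanishes on-shell)}\,.
\end{aligned}
```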
(Poisson bracket of Hamiltonian local observables on covariant phase space)
Let be a Lagrangian field theory (def. 60) where the field bundle is a trivial vector bundle over Minkowski spacetime (example 9).
We say that the Poisson bracket on Hamiltonian local observables (def. 89) is the transgression (def. 82) of the local Poisson bracket (prop. 36) of the corresponding Hamiltonian differential forms (def. 70) to the covariant phase space (prop. 46).
Explicitly: for a choice of Cauchy surface (def. 87), the Poisson bracket between two local Hamiltonian observables is
where on the right we have the transgression of the local Poisson bracket of Hamiltonian differential forms on the jet bundle from prop. 36.
We need to see that equation (109) is well defined, in that it does not depend on the choice of Hamiltonian form representing the local Hamiltonian observable .
It is clear that all the transgressions involved depend only on the restriction of the Hamiltonian forms to the pullback of the jet bundle to the infinitesimal neighbourhood . Moreover, the Poisson bracket on the jet bundle (76) clearly respects this restriction.
If a Hamiltonian differential form is in the kernel of the transgression map relative to , in that for every smooth collection of field histories (according to def. 37) we have (by def. 82)
then the fact that the kernel of integration is the exact differential forms says that is -exact and hence in particular -closed for all :
By prop. 20 this means that
for all . Since is horizontal, the same proposition (see also example 38) implies that in fact is horizontally closed:
Now since the field bundle is trivial by assumption, prop. 21 applies and says that this horizontally closed form on the jet bundle is in fact horizontally exact.
In conclusion this shows that the kernel of the transgression map is precisely the space of horizontally exact horizontal -forms.
Therefore the claim now follows with the statement that horizontally exact Hamiltonian differential forms constitute a Lie ideal for the local Poisson bracket on the jet bundle; this is lemma 2.
(Poisson bracket of the real scalar field)
Consider the Lagrangian field theory of the free scalar field (example 39), and consider the Cauchy surface defined by .
By example 58 the local Poisson bracket of the Hamiltonian forms
and
is
Upon transgression according to def. 90 this yields the following Poisson bracket
where
denote the point-evaluation observables (example 60), which act on a field history as
Notice that these point-evaluation functions themselves do not arise as the transgression of elements in ; only their smearings such as do. Nevertheless we may express the above Poisson bracket conveniently via the integral kernel
(super-Poisson bracket of the Dirac field)
Consider the Lagrangian field theory of the free Dirac field on Minkowski spacetime (example 43) with field bundle the odd-shifted spinor bundle (example 35) and with
the corresponding odd-graded point-evaluation observable (example 60).
Then consider the Cauchy surfaces in Minkowski spacetime (def. 23) given by for . Under transgression to this Cauchy surface via def. 90, the local Poisson bracket, which by example 59 is given by the super Lie bracket
has integral kernel
This concludes our discussion of the phase space and the Poisson-Peierls bracket for well behaved Lagrangian field theories. In the next chapter we discuss in detail the integral kernels corresponding to the Poisson-Peierls bracket for key classes of examples. These are the propagators of the theory.
In the previous chapter we have seen the covariant phase space (prop. 46) of sufficiently nice Lagrangian field theories, which is the on-shell space of field histories equipped with the presymplectic form transgressed from the presymplectic current of the theory; and we have seen that in good cases this induces a bilinear pairing on sufficiently well-behaved observables, called the Poisson bracket (def. 90), which reflects the infinitesimal symmetries of the presymplectic current. This Poisson bracket is of central importance for passing to actual quantum field theory, since, as we will discuss in Quantization below, it is the infinitesimal approximation to the quantization of a Lagrangian field theory.
We have moreover seen that the Poisson bracket on the covariant phase space of a free field theory with Green hyperbolic equations of motion – the Peierls-Poisson bracket – is determined by the integral kernel of the causal Green function (prop. 2). Under the identification of linear on-shell observables with off-shell observables that are generalized solutions to the equations of motion (theorem 1) the convolution with this integral kernel may be understood as propagating the values of an off-shell observable through spacetime, such as to then compare it with any other observable at any spacetime point (prop. 2). Therefore the integral kernel of the causal Green function is also called the causal propagator.
This means that for Green hyperbolic free Lagrangian field theory the Poisson bracket, and hence the infinitesimal quantization of the theory, is all encoded in the causal propagator. Therefore here we analyze the causal propagator, as well as its variant propagators, in detail.
The main tool for these computations is Fourier analysis (reviewed below) by which field histories, observables and propagators on Minkowski spacetime are decomposed as superpositions of plane waves of various frequencies, wave lengths and wave vector-direction. Using this, all propagators are exhibited as those superpositions of plane waves which satisfy the dispersion relation of the given equation of motion, relating plane wave frequency to wave length.
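The role of the dispersion relation can be illustrated numerically. The sketch below applies a finite-difference version of the Klein-Gordon operator to a plane wave in 1+1 dimensions and shows that the result vanishes (up to discretization error) exactly when frequency and wave number satisfy "frequency squared equals wave number squared plus mass squared". The sign convention for the operator (second time derivative minus second space derivative plus mass squared), the restriction to 1+1 dimensions, the numerical values and the function names are assumptions of this sketch; a different metric signature would change signs but not the dispersion relation.

```python
import cmath
import math

def plane_wave(k, w):
    """Plane wave exp(i (k x - w t)) with wave number k and frequency w."""
    return lambda t, x: cmath.exp(1j * (k * x - w * t))

def klein_gordon_residual(f, t, x, m, h=1e-3):
    """Central-difference approximation of (d_t^2 - d_x^2 + m^2) f at the point (t, x)."""
    d2t = (f(t + h, x) - 2 * f(t, x) + f(t - h, x)) / h**2
    d2x = (f(t, x + h) - 2 * f(t, x) + f(t, x - h)) / h**2
    return d2t - d2x + m**2 * f(t, x)

m, k = 1.0, 2.0                       # illustrative mass and wave number
w_on = math.sqrt(k**2 + m**2)         # frequency on the mass shell

# Residual for a plane wave satisfying the dispersion relation ...
on_shell = abs(klein_gordon_residual(plane_wave(k, w_on), 0.4, -1.1, m))
# ... and for one violating it.
off_shell = abs(klein_gordon_residual(plane_wave(k, w_on + 0.5), 0.4, -1.1, m))
```

Here `on_shell` is of the order of the discretization error, while `off_shell` is close to the absolute value of (k squared plus m squared minus frequency squared), which is of order one for the values chosen.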
This way the causal propagator is naturally decomposed into its contribution from positive and from negative frequencies. The positive frequency part of the causal propagator is called the Hadamard propagator (def. 107 below). It turns out (prop. 69 below) that this is equivalently the sum of the causal propagator, which itself is skew-symmetric (cor. 2 below), with a symmetric component, or equivalently that the causal propagator is the skew-symmetrization of the Hadamard propagator. After quantization of free field theory discussed further below, we will see that the Hadamard propagator is equivalently the correlation function between two point-evaluation field observables (example 60) in a vacuum state of the field theory (a state in the sense of def. 86).
Moreover, by def. 78 the causal propagator also decomposes into its contributions with future and past support, given by the difference between the advanced and retarded propagators. These we analyze first, starting with prop. 64 below.
Combining these two decompositions of the causal propagator (positive/negative frequency as well as positive/negative time) yields one more propagator, the Feynman propagator (def. 108 below).
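These relations between the propagators can be made explicit in the toy case of the harmonic oscillator, regarded as a field theory on a 0+1-dimensional spacetime. The sketch below takes the positive-frequency two-point function exp(-i W (t1 - t2)) / (2 W), which is the standard vacuum correlator of the oscillator with unit mass and unit Planck constant, and checks that it is a symmetric part plus i/2 times the causal propagator, that its skew-symmetrization is i times the causal propagator, and that its time-ordered combination depends only on the absolute time difference. The toy model, the prefactors, the sign convention of `causal` and the function names are assumptions of this sketch and may differ from the normalizations used in the text.

```python
import cmath
import math

W = 1.3  # oscillator frequency (illustrative value)

def causal(t1, t2):
    """Causal propagator in the convention {q(t1), q(t2)} = sin(W (t2 - t1)) / W."""
    return math.sin(W * (t2 - t1)) / W

def hadamard(t1, t2):
    """Positive-frequency two-point function exp(-i W (t1 - t2)) / (2 W)."""
    return cmath.exp(-1j * W * (t1 - t2)) / (2 * W)

def symmetric_part(t1, t2):
    """Symmetric contribution cos(W (t1 - t2)) / (2 W)."""
    return math.cos(W * (t1 - t2)) / (2 * W)

def feynman(t1, t2):
    """Time-ordered combination: the later time is always placed first."""
    return hadamard(t1, t2) if t1 >= t2 else hadamard(t2, t1)

t1, t2 = 0.3, 1.9
# Hadamard = symmetric part + (i/2) * causal propagator.
decomposition_error = abs(hadamard(t1, t2)
                          - (symmetric_part(t1, t2) + 0.5j * causal(t1, t2)))
# Skew-symmetrization of Hadamard = i * causal propagator.
skew_error = abs((hadamard(t1, t2) - hadamard(t2, t1)) - 1j * causal(t1, t2))
```

Both error terms vanish up to rounding, and `feynman(t1, t2)` equals exp(-i W |t1 - t2|) / (2 W).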
We will see below that the quantization of a free field theory is given by a “star product” (on observables) which is given by “exponentiating” these propagators. For that to make sense, certain pointwise products of these propagators, regarded as generalized functions (prop. 38) need to exist. But since the propagators are distributions with singularities, the existence of these products requires that certain potential “UV divergences” in their Fourier transforms are absent (“Hörmander's criterion”, prop. 58 below). These UV divergences are captured by what is called the wave front set (def. 101 below).
The study of UV divergences of distributions via their wave front sets is called microlocal analysis and provides powerful tools for the understanding of quantum field theory. In particular the propagation of singularities theorem (prop. 59) shows that for distributional solutions (def. 77) of Euler-Lagrange equations of motion, such as the propagators, their singular support propagates itself through spacetime along the wave front set.
Using this theorem we work out the wave front sets of the propagators (prop. 75 below). Via Hörmander's criterion (prop. 58) this computation will serve to show why upon quantization the Hadamard propagator replaces the causal propagator in the construction of the Wick algebra of quantum observables of the free field theory (discussed below in Free quantum fields) and the Feynman propagator similarly controls the quantum observables of the interacting quantum field theory (below in Feynman diagrams).
The following table summarizes the structure of the system of propagators. (The column “as vacuum expectation value of field operators” will be discussed further below in Free quantum fields).
propagators (i.e. integral kernels of Green functions)
for the wave operator and Klein-Gordon operator
on a globally hyperbolic spacetime such as Minkowski spacetime:
| name | symbol | wave front set | as vacuum exp. value of field operators | as a product of field operators |
|---|---|---|---|---|
| causal propagator | | | | Peierls-Poisson bracket |
| advanced propagator | | | | future part of Peierls-Poisson bracket |
| retarded propagator | | | | past part of Peierls-Poisson bracket |
| Hadamard propagator | | | | positive frequency of Peierls-Poisson bracket, normal-ordered product, 2-point function of vacuum state or generally of Hadamard state |
| Dirac propagator | | | | would-be time-ordered product away from coincident points |
| Feynman propagator | | | | time-ordered product |
(see also Kocic’s overview: pdf)
We now discuss these topics:
Background
Propagators for the free scalar field on Minkowski spacetime
Fourier analysis and plane wave modes
By definition, the equations of motion of free field theories (def. 62) are linear partial differential equations and hence lend themselves to harmonic analysis, where all field histories are decomposed into superpositions of plane waves via Fourier transform. Here we briefly survey the relevant definitions and facts of Fourier analysis.
In formal duality to the harmonic analysis of the field histories themselves, the linear observables (def. 72) on the space of field histories, hence the distributional generalized functions (prop. 37), are also subject to Fourier transform of distributions (def. 96 below).
Throughout, let $n \in \mathbb{N}$ and consider the Cartesian space $\mathbb{R}^n$ of dimension $n$ (def. 1). In the application to field theory, $n$ is the dimension of spacetime and $\mathbb{R}^n$ is either Minkowski spacetime (def. 23) or its dual vector space, thought of as the space of wave vectors (def. 91 below). For $x \in \mathbb{R}^n$ and $k \in (\mathbb{R}^n)^\ast$ we write $x \cdot k = x^\mu k_\mu$
for the canonical pairing.
A plane wave on Minkowski spacetime (def. 23) is a smooth function with values in the complex numbers given by
for a covector, called the wave vector of the plane wave.
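Written out as a formula, the plane wave reads as follows. This is a sketch under an assumed convention: the name $e_k$ and the index placement $k_\mu x^\mu$ for the canonical pairing are choices made for this illustration.

```latex
% Plane wave on Minkowski spacetime with wave vector k (a covector);
% k_\mu x^\mu is the canonical pairing, summed over the spacetime index \mu.
\[
  e_k \colon x \;\longmapsto\; e^{\, i \, k_\mu x^\mu} \;\in\; \mathbb{C}
\]
```

In this convention the component $k_0$ multiplies the time coordinate $x^0$, and the remaining components form the spatial wave vector $\vec k$.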
We use the following terminology:
plane waves on Minkowski spacetime
(Schwartz space of functions with rapidly decreasing partial derivatives)
A complex-valued smooth function is said to have rapidly decreasing partial derivatives if for all we have
Write
for the sub-vector space on the functions with rapidly decreasing partial derivatives regarded as a topological vector space for the Fréchet space structure induced by the seminorms
This is also called the Schwartz space.
(e.g. Hörmander 90, def. 7.1.2)
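The decay condition and the seminorms can be spelled out as follows. The multi-index notation $x^\beta \partial^\alpha$ and the name $p_{\alpha,\beta}$ are assumptions of this sketch, following the convention of the cited reference.

```latex
% Seminorms defining the Schwartz space: every polynomial times every
% partial derivative of f must stay bounded.
\[
  p_{\alpha,\beta}(f) \;:=\; \sup_{x \in \mathbb{R}^n} \bigl| x^\beta \, \partial^\alpha f(x) \bigr| \;<\; \infty
  \qquad \text{for all } \alpha, \beta \in \mathbb{N}^n
\]
% Example: the Gaussian f(x) = exp(-|x|^2) satisfies this for all \alpha, \beta,
% while g(x) = 1/(1 + |x|^2) does not: x^\beta g(x) is unbounded for |\beta| \geq 3.
```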
(compactly supported smooth functions are functions with rapidly decreasing partial derivatives)
Every compactly supported smooth function (bump function) has rapidly decreasing partial derivatives (def. 92):
(pointwise product and convolution product on Schwartz space)
The Schwartz space (def. 92) is closed under the following operations on smooth functions:
pointwise product:
By the product law of differentiation.
(rapidly decreasing functions are integrable)
Every rapidly decreasing function (def. 92) is an integrable function in that its integral exists:
In fact for each multi-index $\alpha \in \mathbb{N}^n$ the integral of the product of $f$ with the $\alpha$-power of the coordinate functions exists:
(Fourier transform of functions with rapidly decreasing partial derivatives)
The Fourier transform is the continuous linear map
on the Schwartz space of functions with rapidly decreasing partial derivatives (def. 92), which is given by integration against plane wave functions (def. 91)
times the standard volume form :
Here the argument of the Fourier transform is also called the wave vector.
(e.g. Hörmander 90, lemma 7.1.3)
The Fourier transform (def. 93) on the Schwartz space (def. 92) is an isomorphism, with inverse function the inverse Fourier transform
given by
Hence in the language of harmonic analysis the function is the superposition of plane waves (def. 91) in which the plane wave with wave vector appears with amplitude .
(e.g. Hörmander 90, theorem 7.1.5)
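As a sketch, the Fourier transform and its inverse can be written out in the following normalization. The placement of the sign in the exponent and of the factor $(2\pi)^n$ is an assumed convention (one of several in use); the Gaussian at the end is a consistency check of this convention.

```latex
% Fourier transform: integration against the plane wave x -> e^{-i k.x}
\[
  \hat f(k) \;:=\; \int_{x \in \mathbb{R}^n} e^{-i \, k \cdot x} \, f(x) \, d^n x
\]
% Inverse Fourier transform: superposition of plane waves e^{+i k.x} with amplitude g(k)
\[
  \check g(x) \;:=\; \int_{k \in \mathbb{R}^n} g(k) \, e^{\, i \, k \cdot x} \, \frac{d^n k}{(2\pi)^n}
\]
% Check on a Gaussian:
\[
  f(x) = e^{-|x|^2/2}
  \quad\Longrightarrow\quad
  \hat f(k) = (2\pi)^{n/2} \, e^{-|k|^2/2}
  \quad\Longrightarrow\quad
  \check{\hat f}(x) = e^{-|x|^2/2} = f(x)
\]
```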
(basic properties of the Fourier transform)
The Fourier transform (def. 93) on the Schwartz space (def. 92) satisfies the following properties, for all :
(interchanging coordinate multiplication with partial derivatives)
(interchanging pointwise multiplication with convolution product, remark 50):
(e.g. Hörmander 90, lemma 7.1.3, theorem 7.1.6)
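In formulas, these properties take roughly the following form. This sketch assumes the convention $\hat f(k) = \int e^{-i k \cdot x} f(x) \, d^n x$; with other normalizations the factors of $i$ and $2\pi$ move around. The symbol $\star$ for the convolution product is likewise an assumption of the sketch.

```latex
% requires amsmath
\begin{align*}
  % derivatives <-> coordinate multiplication
  \widehat{\partial_j f}(k) &= i \, k_j \, \hat f(k),
  &
  \widehat{x^j f}(k) &= i \, \frac{\partial}{\partial k_j} \hat f(k)
  \\
  % convolution products <-> pointwise products
  \widehat{f \star g} &= \hat f \cdot \hat g,
  &
  \widehat{f \cdot g} &= (2\pi)^{-n} \, \hat f \star \hat g
  \\
  % moving the transform across the integral pairing
  \int f(x) \, \hat g(x) \, d^n x &= \int \hat f(x) \, g(x) \, d^n x
\end{align*}
```

The first identity follows by integration by parts, the second by differentiating under the integral sign, and the last by exchanging the order of the two integrations.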
The Schwartz space of functions with rapidly decreasing partial derivatives (def. 92) serves the purpose to support the Fourier transform (def. 93) together with its inverse (prop. 94), but for many applications one needs to apply the Fourier transform to more general functions, and in fact to generalized functions in the sense of distributions (via this prop.). But with the Schwartz space in hand, this generalization is readily obtained by formal duality:
A tempered distribution is a continuous linear functional
on the Schwartz space (def. 92) of functions with rapidly decaying partial derivatives. The vector space of all tempered distributions is canonically a topological vector space as the dual space to the Schwartz space, denoted
(e.g. Hörmander 90, def. 7.1.7)
(some non-singular tempered distributions)
Every function with rapidly decreasing partial derivatives (def. 92) induces a tempered distribution (def. 95) by integrating against it:
This construction is a linear inclusion
of the Schwartz space into its dual space of tempered distributions. This is a dense subspace inclusion.
In fact already the restriction of this inclusion to the compactly supported smooth functions (example 77) is a dense subspace inclusion:
This means that every tempered distribution is a limit of a sequence of ordinary functions with rapidly decreasing partial derivatives, and in fact even the limit of a sequence of compactly supported smooth functions (bump functions).
It is in this sense that tempered distributions are “generalized functions”.
(e.g. Hörmander 90, lemma 7.1.8)
(compactly supported distributions are tempered distributions)
Every compactly supported distribution is a tempered distribution (def. 95), hence there is a linear inclusion
Write
for the distribution given by point evaluation of functions at the origin of :
This is clearly a compactly supported distribution; hence a tempered distribution by example 79.
We write just “” (without the subscript) for the corresponding generalized function (example 78), so that
(square integrable functions induce tempered distributions)
Let $f \in L^p(\mathbb{R}^n)$ be a function in the $p$th Lebesgue space, e.g. for $p = 2$ this means that $f$ is a square integrable function. Then the operation of integration against the measure
is a tempered distribution (def. 95).
(e.g. Hörmander 90, below lemma 7.1.8)
Property (114) of the ordinary Fourier transform on functions with rapidly decreasing partial derivatives motivates and justifies the following generalization:
(Fourier transform of distributions on tempered distributions)
The Fourier transform of distributions of a tempered distribution (def. 95) is the tempered distribution defined on a smooth function in the Schwartz space (def. 92) by
where on the right is the Fourier transform of functions from def. 93.
(e.g. Hörmander 90, def. 7.1.9)
(Fourier transform of distributions indeed generalizes Fourier transform of functions with rapidly decreasing partial derivatives)
Let $u_f$ be a non-singular tempered distribution induced, via example 78, from a function $f$ with rapidly decreasing partial derivatives.
Then its Fourier transform of distributions (def. 96) is the non-singular distribution induced from the Fourier transform of $f$:
Let . Then
Here all equalities hold by definition, except for the third: this is property (114) from prop. 52.
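The definition and the chain of equalities in this proof can be sketched as follows. The notation $u_f$ for the distribution induced from a function $f$, and $g$ for the test function, are assumptions of this illustration.

```latex
% Definition: the Fourier transform is moved onto the test function
\[
  \hat u (g) \;:=\; u(\hat g) \qquad \text{for } g \in \mathcal{S}(\mathbb{R}^n)
\]
% Consistency check for u = u_f induced from a Schwartz function f;
% the third equality is the exchange property of the ordinary Fourier transform
\[
  \widehat{u_f}(g)
  \;=\; u_f(\hat g)
  \;=\; \int f(x) \, \hat g(x) \, d^n x
  \;=\; \int \hat f(x) \, g(x) \, d^n x
  \;=\; u_{\hat f}(g)
\]
```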
(Fourier transform of Klein-Gordon equation of distributions)
Let be any tempered distribution (def. 95) on Minkowski spacetime (def. 23) and let be the Klein-Gordon operator (65). Then the Fourier transform (def. 96) of is, in generalized function-notation (remark 19), given by
Let be any function with rapidly decreasing partial derivatives (def. 92). Then
Here the first step is def. 96, the second is def. 77, the third is example 51, while the last step is prop. 52.
(Fourier transform of compactly supported distributions)
Under the identification of smooth functions of bounded growth with non-singular tempered distributions (example 78), the Fourier transform of distributions (def. 96) of a tempered distribution that happens to be compactly supported (example 79)
is simply
(Hörmander 90, theorem 7.1.14)
(Fourier transform of the delta-distribution)
The Fourier transform (def. 96) of the delta distribution (def. 80), via example 84, is the constant function with value 1:
This implies by the Fourier inversion theorem (prop. 54) that the delta distribution itself has equivalently the following expression as a generalized function
in the sense that for every function with rapidly decreasing partial derivatives (def. 92) we have
which is the statement of the Fourier inversion theorem for smooth functions (prop. 94).
(Here in the last step we used a change of integration variables, which introduces one sign for the new volume form, but another sign from the re-orientation of the integration domain.)
Equivalently, the above computation shows that the delta distribution is the neutral element for the convolution product of distributions.
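In generalized function notation, the two statements of this example can be sketched as follows, assuming the Fourier convention with $e^{-i k \cdot x}$ in the transform and $d^n k / (2\pi)^n$ in the inverse.

```latex
\[
  \hat \delta(k) \;=\; \int \delta(x) \, e^{-i \, k \cdot x} \, d^n x \;=\; 1
  \qquad\Longrightarrow\qquad
  \delta(x) \;=\; \int e^{\, i \, k \cdot x} \, \frac{d^n k}{(2\pi)^n}
\]
```

The second integral does not converge as an ordinary integral; it is meaningful only after pairing with a function with rapidly decreasing partial derivatives, as described above.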
(Paley-Wiener-Schwartz theorem)
Let be a compactly supported distribution regarded as a tempered distribution by example 79. Then its Fourier transform of distributions (def. 96) is a non-singular distribution induced from a smooth function that grows at most exponentially.
(Fourier inversion theorem for Fourier transform of distributions)
The operation of forming the Fourier transform of distributions (def. 96) of tempered distributions (def. 95) is an isomorphism, with inverse given by
where on the right is the ordinary inverse Fourier transform of according to prop. 94.
By def. 96 this follows immediately from the Fourier inversion theorem for smooth functions (prop. 94).
We have the following distributional generalization of the basic property (113) from prop. 52:
(Fourier transform of distributions interchanges convolution of distributions with pointwise product)
Let
be a tempered distribution (def. 95) and
be a compactly supported distribution, regarded as a tempered distribution via example 79.
Observe here that the Paley-Wiener-Schwartz theorem (prop. 53) implies that the Fourier transform of distributions of is a non-singular distribution so that the product is always defined.
Then the Fourier transform of distributions of the convolution product of distributions is the product of the Fourier transform of distributions:
(e.g. Hörmander 90, theorem 7.1.15)
(product of distributions via Fourier transform of distributions)
Prop. 55 together with the Fourier inversion theorem (prop. 54) suggests to define the product of distributions for compactly supported distributions by the formula
which would complete the generalization of property (113) from prop. 52.
For this to make sense, the convolution product of the smooth functions on the right needs to exist, which is not guaranteed (prop. 50 does not apply here!). The condition that this exists is the Hörmander criterion on the wave front set of and . This we further discuss in Microlocal analysis and UV-Divergences below.
microlocal analysis and ultraviolet divergences
A distribution (def. 37) or generalized function (prop. 38) is like a smooth function which may have “singularities”, namely points at which its values or those of its derivatives “become infinite”. Conversely, smooth functions are the non-singular distributions (prop. 38). The collection of points around which a distribution is singular (i.e. not non-singular) is called its singular support (def. 99 below).
The Fourier transform of distributions (def. 96) decomposes a generalized function into the plane wave modes that it is made of (def. 91). The Paley-Wiener-Schwartz theorem (prop. 56 below) says that the singular nature of a compactly supported distribution may be read off from this Fourier mode decomposition: Singularities correspond to large contributions by Fourier modes of high frequency and small wavelength, hence to large “ultraviolet” (UV) contributions. Therefore the singular support of a distribution is the set of points around which the Fourier transform does not sufficiently decay “in the UV”.
But since the Fourier transform is a function of the full wave vector of the plane wave modes (def. 91), not just of the frequency/wavelength, but also of the direction of the wave vector, this means that it contains directional information about the singularities: A distribution may have UV-singularities at some point and in some wave vector direction, but maybe not in other directions.
In particular, if the distribution in question is a distributional solution to a partial differential equation (def. 77) on spacetime then the propagation of singularities theorem (prop. 59 below) says that the singular support of the solution evolves in spacetime along the direction of those wave vectors in which the Fourier transform exhibits high UV contributions. This means that these directions are the “wave front” of the distributional solution. Accordingly, the singular support of a distribution together with, over each of its points, the directions of wave vectors in which the Fourier transform around that point has large UV contributions is called the wave front set of the distribution (def. 101 below).
What is called microlocal analysis is essentially the analysis of distributions with attention to their wave front set, hence to the wave vector-directions of UV divergences.
In particular the product of distributions is well defined (only) if the wave front sets of the distributions do not “collide”. And this in fact motivates the definition of the wave front set:
To see this, let $u, v$ be two distributions, for simplicity of exposition taken on the real line.
Since the product $u \cdot v$ is, if it exists, supposed to generalize the pointwise product of smooth functions, it must be fixed locally: for every point $x \in \mathbb{R}$ there ought to be a compactly supported smooth function (bump function) $b$ with $b(x) = 1$ such that
But now $b u$ and $b v$ are both compactly supported distributions (def. 100 below), and these have the special property that their Fourier transforms $\widehat{b u}$ and $\widehat{b v}$ are, in particular, smooth functions (by the Paley-Wiener-Schwartz theorem, prop. 53).
Moreover, the operation of Fourier transform interchanges pointwise products with convolution products (prop. 52). This means that if the product of distributions exists, it must locally be given by the inverse Fourier transform of the convolution product of the Fourier transforms $\widehat{b u}$ and $\widehat{b v}$:
(Notice that the converse of this formula holds as a fact by prop. 55)
This shows that the product of distributions exists once there is a bump function $b$ such that the integral on the right converges as $k \to \pm\infty$.
Now the Paley-Wiener-Schwartz theorem says more: it says that the Fourier transforms $\widehat{b u}$ and $\widehat{b v}$ are polynomially bounded. On the other hand, the integral above is well defined if the integrand decreases at least quadratically with $k$. This means that for the convolution product to be well defined, either $\widehat{b u}$ has to decrease polynomially faster with $k \to \pm\infty$ than $\widehat{b v}$ grows in the other direction, $k \to \mp\infty$ (due to the minus sign in the argument of the second factor in the convolution product), or the other way around.
Moreover, the degree of polynomial growth of the Fourier transform increases by one with each derivative. Therefore if the product law for derivatives of distributions is to hold generally, we need that either $\widehat{b u}$ or $\widehat{b v}$ decays faster than any polynomial in the opposite of the directions in which the respective other factor does not decay.
Here the set of directions of wave vectors in which the Fourier transform of a distribution localized around any point does not decay rapidly (i.e. faster than any inverse polynomial) is the wave front set of the distribution (def. 101 below). Hence the condition that the product of two distributions is well defined is that for each wave vector direction in the wave front set of one of the two distributions, the opposite direction must not be an element of the wave front set of the other distribution. This is called Hörmander's criterion (prop. 58 below).
We now say this in detail:
(restriction of distributions)
For $U \subset \mathbb{R}^n$ a subset and $u$ a distribution, the restriction of $u$ to $U$ is the distribution
given by restricting $u$ to test functions whose support is in $U$.
(singular support of a distribution)
Given a distribution , a point is a singular point if there is no neighbourhood of such that the restriction (def. 98) is a non-singular distribution (given by a smooth function).
The set of all singular points is the singular support of .
(product of a distribution with a smooth function)
Let be a distribution, and a smooth function. Then the product is the evident distribution given on a test function by
(Paley-Wiener-Schwartz theorem – decay of Fourier transform of compactly supported functions)
A compactly supported distribution is non-singular, hence given by a compactly supported function via , precisely if its Fourier transform (this def.) satisfies the following decay property:
For all there exists such that for all we have that the absolute value of the Fourier transform at that point is bounded by
(Hörmander 90, around (8.1.1))
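The decay property can be written as the following estimate. The symbols $v$ for the compactly supported distribution, $\xi$ for the wave vector and $C_N$ for the constants are assumptions of this sketch, in the style of the cited reference.

```latex
\[
  \forall\, N \in \mathbb{N} \;\; \exists\, C_N \in \mathbb{R}_+ \;\; \forall\, \xi \in \mathbb{R}^n :
  \qquad
  \bigl| \hat v(\xi) \bigr| \;\leq\; C_N \, \bigl( 1 + |\xi| \bigr)^{-N}
\]
% Example: for the delta distribution the Fourier transform is constant,
% which violates the estimate for every N \geq 1; accordingly \delta is singular.
```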
Let $u$ be a distribution.
For $b$ a compactly supported smooth function, write $b u$ for the corresponding product (def. 100), which is now a compactly supported distribution.
For $x \in \mathbb{R}^n$, we say that a unit covector $k$ is regular if there exists a neighbourhood $U$ of $k$ in the unit sphere such that for all $c k'$ with $c \in \mathbb{R}_+$ and $k' \in U$ the decay estimate (115) is valid for the Fourier transform $\widehat{b u}$ of $b u$ at $c k'$. Otherwise $k$ is non-regular. Write $\Sigma(b u)$
for the set of non-regular covectors of $b u$.
The wave front set at $x$ is the intersection $\Sigma_x(u)$ of these sets as $b$ ranges over bump functions whose support includes $x$:
Finally the wave front set of $u$ is the subset of the sphere bundle which over $x$ consists of $\Sigma_x(u)$:
Often this is equivalently considered as the full conical set inside the cotangent bundle generated by the unit covectors under multiplication with positive real numbers.
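The two steps of this definition can be sketched in formulas as follows. The names $\Sigma(b u)$, $\Sigma_x(u)$ and $\mathrm{WF}(u)$, and the notation $C^\infty_{cp}$ for compactly supported smooth functions, are assumptions of this illustration.

```latex
% requires amsmath (for \substack)
% Step 1: intersect the non-regular directions over all localizations b u around x
\[
  \Sigma_x(u) \;:=\; \bigcap_{\substack{b \in C^\infty_{cp}(\mathbb{R}^n) \\ x \in \mathrm{supp}(b)}} \Sigma(b u)
\]
% Step 2: collect these fibers over all points x
\[
  \mathrm{WF}(u) \;:=\; \bigcup_{x \in \mathbb{R}^n} \{x\} \times \Sigma_x(u)
\]
```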
(wave front set is the UV divergence-direction-bundle over the singular support)
For $u$ a distribution, the Paley-Wiener-Schwartz theorem (prop. 56) implies that:
Forgetting the direction covectors in the wave front set (def. 101) and remembering only the points where they are based yields the set of singular points of $u$, hence the singular support (def. 99).
The wave front set is empty precisely if the singular support is empty, which is the case precisely if $u$ is a non-singular distribution.
(wave front set of delta distribution)
Consider the delta distribution
given by evaluation at the origin. Its wave front set (def. 101) consists of all the directions at the origin:
First of all the singular support (def. 99) of $\delta$ is clearly $\{0\}$, hence by remark 22 the wave front set vanishes over $\mathbb{R}^n \setminus \{0\}$.
At the origin, any bump function $b$ supported around the origin with $b(0) = 1$ satisfies $b \cdot \delta = \delta$ and hence the wave front set over the origin is the set of covectors along which the Fourier transform $\hat \delta$ does not suitably decay. But this Fourier transform is in fact a constant function (example 97) and hence does not decay in any direction.
(wave front set of step function)
Let be the Heaviside step function given by
Its wave front set (def. 101) is
(wave front set of convolution of compactly supported distributions)
Let be two compactly supported distributions. Then the wave front set (def. 101) of their convolution product of distributions is
(Hörmander's criterion for product of distributions)
Let $u, v$ be two distributions. If their wave front sets (def. 101) do not collide, in that for a covector contained in one of the two wave front sets the covector with the opposite direction is not contained in the other wave front set, i.e. the intersection fiber product inside the cotangent bundle of the pointwise sum of wave fronts with the zero section is empty:
i.e.
then the product of distributions exists, given, locally, by the Fourier inversion of the convolution product of their Fourier transform of distributions.
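As a sketch, the non-collision condition and the simplest example where it fails can be written as follows. The names $u$, $v$ and $\mathrm{WF}$ are assumptions of this illustration; the wave front set of the delta distribution is the one computed in the example above.

```latex
% Hörmander's criterion for two distributions u and v:
\[
  (x, k) \in \mathrm{WF}(u) \;\Longrightarrow\; (x, -k) \notin \mathrm{WF}(v)
\]
% Example where the criterion fails: u = v = \delta on the real line, with
\[
  \mathrm{WF}(\delta) \;=\; \{ (0, k) \;|\; k \neq 0 \}
\]
% Both (0, k) and (0, -k) lie in WF(\delta), so the criterion does not
% guarantee that the square \delta \cdot \delta exists.
```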
(symbol of a differential operator)
Let
be a differential operator on (def. 56). Then its symbol is the smooth function on the cotangent bundle (def. 5) given by
The principal symbol is the top degree homogeneous part .
A smooth function on the cotangent bundle (e.g. the symbol of a differential operator, def. 102 ) is of order (and type , denoted ), for , if on each coordinate chart we have that for every compact subset of the base space and all multi-indices and , there is a real number such that the absolute value of the partial derivatives of is bounded by
for all and all cotangent vectors to .
A Fourier integral operator is of symbol class if it is of the form
with symbol of order , in the above sense.
(Hörmander 71, def. 1.1.1 and first sentence of section 2.1 with (1.4.1))
(propagation of singularities theorem)
Let $Q$ be a pseudo-differential operator on some smooth manifold $X$ which is properly supported and of symbol class (def. 103) with real principal symbol $q$ that is homogeneous of degree $m$.
For $u$ a distribution with $Q u = f$, the complement of the wave front set of $u$ by that of $f$ is contained in the set of covectors on which the principal symbol $q$ vanishes:
Moreover, this complement is invariant under the bicharacteristic flow induced by the Hamiltonian vector field of $q$ with respect to the canonical symplectic manifold structure on the cotangent bundle.
(Duistermaat-Hörmander 72, theorem 6.1.1, recalled for instance as Radzikowski 96, theorem 4.6)
An important application of the Fourier analysis of distributions is the class of distributions known broadly as Cauchy principal values. Below we will find that these control the detailed nature of the various propagators of free field theories: notably the Feynman propagator is manifestly a Cauchy principal value (prop. 71 and def. 110 below), but also the singular support properties of the causal propagator and the Hadamard propagator are governed by Cauchy principal values (prop. 72 and prop. 73 below). This way the understanding of Cauchy principal values eventually allows us to determine the wave front set of all the propagators (prop. 75 below).
Therefore we now collect some basic definitions and facts on Cauchy principal values.
The Cauchy principal value of a function which is integrable on the complement of one point is, if it exists, the limit of the integrals of the function over subsets in the complement of this point as these integration domains tend to that point symmetrically from all sides.
One also subsumes the case that the “point” is “at infinity”, hence that the function is integrable over every bounded domain. In this case the Cauchy principal value is the limit, if it exists, of the integrals of the function over bounded domains, as their bounds tend symmetrically to infinity.
The operation of sending a compactly supported smooth function (bump function) to the Cauchy principal value of its pointwise product with a function that may be singular at the origin defines a distribution, usually denoted .
(Cauchy principal value of an integral over the real line)
Let be a function on the real line such that for every positive real number its restriction to is integrable. Then the Cauchy principal value of is, if it exists, the limit
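Spelled out, the limit in this definition has the following form. The notation $PV(f)$ and the symbol $\epsilon$ for the positive real number are assumptions of this sketch; the example function is chosen only for illustration.

```latex
% Cauchy principal value: cut out a symmetric interval around the origin, then shrink it
\[
  PV(f) \;:=\; \lim_{\epsilon \to 0^+} \int_{\mathbb{R} \setminus (-\epsilon, \epsilon)} f(x) \, dx
\]
% Example: f(x) = 1/x for -1 \leq x \leq 2 (x \neq 0), and f(x) = 0 otherwise.
% The two logarithmic divergences cancel:
\[
  \int_{-1}^{-\epsilon} \frac{dx}{x} + \int_{\epsilon}^{2} \frac{dx}{x}
  \;=\; \ln \epsilon + (\ln 2 - \ln \epsilon)
  \;=\; \ln 2
\]
```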
(Cauchy principal value as distribution on the real line)
Let be a function on the real line such that for all bump functions the Cauchy principal value of the pointwise product function exists, in the sense of def. 104. Then this assignment
defines a distribution .
Let be an integrable function which is symmetric, in that for all . Then the principal value integral (def. 104) of exists and is zero:
This is because, by the symmetry of $f$ and the skew-symmetry of $x \mapsto 1/x$, the two contributions to the integral are equal up to a sign:
The Cauchy principal value distribution (def. 105) solves the distributional equation
Since the delta distribution solves the equation
we have that more generally every linear combination of the form
for , is a distributional solution to .
The wave front set of all these solutions is
The first statement is immediate from the definition: For any bump function we have that
Regarding the second statement: It is clear that the wave front set is concentrated at the origin. By symmetry of the distribution around the origin, it must contain both directions.
This follows by the characterization of extension of distributions to a point, see there at this prop. (Hörmander 90, thm. 3.2.4)
(integration against inverse variable with imaginary offset)
Write
for the distribution which is the limit in of the non-singular distributions which are given by the smooth functions as the positive real number tends to zero:
hence the distribution which sends to
(Cauchy principal value equals integration with imaginary offset plus delta distribution)
The Cauchy principal value distribution (def. 105) is equal to the sum of the integration over with imaginary offset (def. 106) and a delta distribution.
In particular, by prop. 88 this means that solves the distributional equation
Using that
we have for every bump function
Since
it is plausible that , and similarly that . In detail:
and
where we used that the derivative of the arctan function is and that is proportional to the sign function.
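The steps of this proof can be summarized in formulas as follows; the result is commonly known as the Sokhotski-Plemelj formula. The notation $1/(x \pm i 0^+)$ for the limiting distribution and $PV(1/x)$ for the principal value distribution are assumptions of this sketch.

```latex
% requires amsmath
\begin{align*}
  % split the regularized integrand into real and imaginary part
  \frac{1}{x \pm i \epsilon}
  &= \frac{x \mp i \epsilon}{x^2 + \epsilon^2}
   = \frac{x}{x^2 + \epsilon^2} \;\mp\; i \, \frac{\epsilon}{x^2 + \epsilon^2}
  \\
  % the imaginary part has total integral \pi and concentrates at the origin
  \int_{-\infty}^{\infty} \frac{\epsilon}{x^2 + \epsilon^2} \, dx
  &= \Bigl[ \arctan(x/\epsilon) \Bigr]_{-\infty}^{\infty}
   = \pi
  \\
  % hence in the limit \epsilon \to 0^+
  \frac{1}{x \pm i 0^+}
  &= PV\!\left( \frac{1}{x} \right) \;\mp\; i \pi \, \delta
\end{align*}
```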
(Fourier integral formula for step function)
The Heaviside distribution is equivalently the following Cauchy principal value (def. 105):
where the limit is taken over sequences of positive real numbers tending to zero.
We may think of the integrand uniquely extended to a holomorphic function on the complex plane and consider computing the given real line integral for fixed as a contour integral in the complex plane.
If is positive, then the exponent
has negative real part for positive imaginary part of . This means that the line integral equals the complex contour integral over a contour closing in the upper half plane. Since has positive imaginary part by construction, this contour does encircle the pole of the integrand at . Hence by the Cauchy integral formula in the case one gets
Conversely, for the real part of the integrand decays as the negative imaginary part increases, and hence in this case the given line integral equals the contour integral for a contour closing in the lower half plane. Since the integrand has no pole in the lower half plane, in this case the Cauchy integral formula says that this integral is zero.
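The contour argument of this proof can be sketched in formulas as follows. The names $\Theta$ for the Heaviside distribution, $\omega$ for the integration variable and $\epsilon$ for the regulator are assumptions of this illustration.

```latex
% Fourier integral formula for the step function
\[
  \Theta(x)
  \;=\; \lim_{\epsilon \to 0^+} \frac{1}{2\pi i} \int_{-\infty}^{\infty} \frac{e^{\, i \omega x}}{\omega - i \epsilon} \, d\omega
\]
% x > 0: close the contour in the upper half plane; it encircles the pole at \omega = i \epsilon
\[
  \frac{1}{2\pi i} \cdot 2 \pi i \, \mathrm{Res}_{\omega = i \epsilon} \frac{e^{\, i \omega x}}{\omega - i \epsilon}
  \;=\; e^{-\epsilon x}
  \;\xrightarrow{\;\epsilon \to 0^+\;}\; 1
\]
% x < 0: close the contour in the lower half plane; no pole is encircled, so the integral is 0
```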
Conversely, by the Fourier inversion theorem, the Fourier transform of the Heaviside distribution is the Cauchy principal value as in prop. 61:
(relation to Fourier transform of Heaviside distribution / Schwinger parameterization)
The Fourier transform of distributions (def. 96) of the Heaviside distribution is the following Cauchy principal value:
Here the second equality is also known as complex Schwinger parameterization.
As generalized functions consider the limit with a decaying component:
Let now be a non-degenerate real quadratic form analytically continued to a complex quadratic form
Write for the determinant of
Write for the induced quadratic form on dual vector space. Notice that (and hence ) are assumed non-degenerate but need not necessarily be positive or negative definite.
(Fourier transform of principal value of power of quadratic form)
Let be any real number, and any complex number. Then the Fourier transform of distributions of is
where
$\Gamma$ denotes the Gamma function,
$K_\nu$ denotes the modified Bessel function of order $\nu$.
Notice that $K_\nu(a)$ diverges for $a \to 0$ as $a^{-\nu}$, for $\nu > 0$ (DLMF 10.30.2).
(Gel’fand-Shilov 66, III 2.8 (8) and (9), p 289)
(Fourier transform of delta distribution applied to mass shell)
Let , then the Fourier transform of distributions of the delta distribution applied to the “mass shell” is
where denotes the modified Bessel function of order .
Notice that $K_\nu(a)$ diverges for $a \to 0$ as $a^{-\nu}$, for $\nu > 0$ (DLMF 10.30.2).
(Gel’fand-Shilov 66, III 2.11 (7), p 294)
propagators for the free scalar field on Minkowski spacetime
On Minkowski spacetime consider the Klein-Gordon operator (example 25)
By example 83 its Fourier transform is
The dispersion relation of this equation we write (see def. 91)
where on the right we choose the non-negative square root.
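For orientation, the three expressions referred to here can be sketched as follows. This assumes natural units $c = \hbar = 1$ and the mostly-plus Minkowski metric $\eta = \mathrm{diag}(-1, +1, \dots, +1)$; both are simplifying assumptions of this illustration, as are the names $m$ for the mass and $\omega(\vec k)$ for the dispersion relation.

```latex
% Klein-Gordon operator
\[
  \Box - m^2 \;=\; \eta^{\mu\nu} \partial_\mu \partial_\nu - m^2 \;=\; -\partial_0^2 + \nabla^2 - m^2
\]
% Fourier transform: each \partial_\mu is replaced by i k_\mu
\[
  -\eta^{\mu\nu} k_\mu k_\nu - m^2 \;=\; (k_0)^2 - |\vec k|^2 - m^2
\]
% This vanishes precisely for k_0 = \pm \omega(\vec k), with the dispersion relation
\[
  \omega(\vec k) \;:=\; + \sqrt{ |\vec k|^2 + m^2 }
\]
```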
advanced and retarded propagators for Klein-Gordon equation on Minkowski spacetime
(mode expansion of advanced and retarded propagators for Klein-Gordon operator on Minkowski spacetime)
The advanced and retarded Green functions (def. 78) of the Klein-Gordon operator on Minkowski spacetime (example 25) are induced from integral kernels (“propagators”), hence distributions in two variables
by (in generalized function-notation, prop. 38)
where the advanced and retarded propagators have the following equivalent expressions:
Here denotes the dispersion relation (118) of the Klein-Gordon equation.
The Klein-Gordon operator is a Green hyperbolic differential operator (example 63) therefore its advanced and retarded Green functions exist uniquely (prop. 41). Moreover, prop. 42 says that they are continuous linear functionals with respect to the topological vector space structures on spaces of smooth sections (def. 73). In the case of the Klein-Gordon operator this just means that
are continuous linear functionals in the standard sense of distributions. Therefore the Schwartz kernel theorem implies the existence of integral kernels being distributions in two variables
such that, in the notation of generalized functions,
These integral kernels are the advanced/retarded “propagators”. We now compute these integral kernels by making an Ansatz and showing that it has the defining properties, which identifies them by the uniqueness statement of prop. 41.
We make use of the fact that the Klein-Gordon equation is invariant under the defining action of the Poincaré group on Minkowski spacetime, which is a semidirect product group of the translation group and the Lorentz group.
Since the Klein-Gordon operator is invariant, in particular, under translations in it is clear that the propagators, as distributions in two variables, depend only on the difference of their two arguments
Since moreover the Klein-Gordon operator is formally self-adjoint (this prop.) this implies that for the Klein-Gordon operator the equation (82)
is equivalent to the equation (81)
Therefore it is sufficient to solve the first of these two equations, subject to the defining support conditions. In terms of the propagator integral kernels this means that we have to solve the distributional equation
subject to the condition that the distributional support (def. 74) is
We make the Ansatz that $\Delta_\pm$, as a distribution in a single variable $x$, is a tempered distribution
hence amenable to Fourier transform of distributions (def. 96). If we do find a solution this way, it is guaranteed to be the unique solution by prop. 41.
By example 82 the distributional Fourier transform of equation (121) is
where in the second line we used the Fourier transform of the delta distribution from example 97.
Notice that this implies that the Fourier transform of the causal propagator
satisfies the homogeneous equation:
Hence we are now reduced to finding solutions to (122) such that their Fourier inverse has the required support properties.
We discuss this by a variant of the Cauchy principal value:
Suppose the following limit of non-singular distributions in the variable exists in the space of distributions
meaning that for each bump function the limit in
exists. Then this limit is clearly a solution to the distributional equation (122) because on those bump functions which happen to be products with we clearly have
Moreover, if the limiting distribution (124) exists, then it is clearly a tempered distribution, hence we may apply Fourier inversion to obtain Green functions
To see that this is the correct answer, we need to check the defining support property.
Finally, by the Fourier inversion theorem, to show that the limit (124) indeed exists it is sufficient to show that the limit in (125) exists.
We compute as follows
where denotes the dispersion relation (118) of the Klein-Gordon equation. The last step is simply the application of Euler's formula .
Here the key step is the application of Cauchy's integral formula in the fourth step. We spell this out now for , the discussion for is the same, just with the appropriate signs reversed.
Conversely, if then we may analogously expand into the lower half plane.

Apply Cauchy's integral formula to find, in the case that the contour encircles the poles, the sum of the residues at these two poles times $2 \pi i$, and zero in the other case. (For the retarded propagator we get $- 2 \pi i$ times the residues, because now the contours encircling non-trivial poles go clockwise).
The result is now non-singular at and therefore the limit is now computed by evaluating at .
This computation shows a) that the limiting distribution indeed exists, and b) that the support of is in the future, and that of is in the past.
Hence it only remains to see now that the support of is inside the causal cone. But this follows from the previous argument, by using that the Klein-Gordon equation is invariant under Lorentz transformations: This implies that the support is in fact in the future of every spacelike slice through the origin in , hence in the closed future cone of the origin.
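The computation of this proof can be sketched as follows. This is an illustration under stated assumptions, not a transcription of the displayed formulas: natural units $c = \hbar = 1$, spacetime dimension $p + 1$, mostly-plus metric, the Fourier convention with $e^{+i k \cdot x}$ and $(2\pi)^{-(p+1)}$ in the inverse transform, and $\Delta_+$ denoting the advanced and $\Delta_-$ the retarded propagator as functions of the difference $x$ of their two arguments.

```latex
% Fourier-space equation and its regularized solution
\[
  \bigl( (k_0)^2 - |\vec k|^2 - m^2 \bigr) \, \widehat{\Delta_\pm}(k) \;=\; 1,
  \qquad
  \Delta_\pm(x)
  \;=\; \lim_{\epsilon \to 0^+} \frac{1}{(2\pi)^{p+1}}
  \int \!\! \int
  \frac{ e^{\, i k_0 x^0} \, e^{\, i \vec k \cdot \vec x} }{ (k_0 \mp i \epsilon)^2 - \omega(\vec k)^2 }
  \, d k_0 \, d^p \vec k
\]
% Residue sum at the two poles k_0 = \pm \omega(\vec k) (after \epsilon \to 0)
\[
  \mathrm{Res}_{k_0 = +\omega} + \mathrm{Res}_{k_0 = -\omega}
  \;=\; \frac{ e^{\, i \omega x^0} - e^{-i \omega x^0} }{ 2 \omega }
  \;=\; \frac{ i \sin(\omega x^0) }{ \omega }
\]
% Result: 2 \pi i (advanced) resp. -2 \pi i (retarded) times the residue sum
\[
  \Delta_\pm(x)
  \;=\;
  \begin{cases}
    \dfrac{\mp 1}{(2\pi)^p} \displaystyle\int \frac{ \sin\bigl( \omega(\vec k) \, x^0 \bigr) }{ \omega(\vec k) } \, e^{\, i \vec k \cdot \vec x} \, d^p \vec k
      & \text{if } \pm x^0 > 0
    \\
    0 & \text{otherwise}
  \end{cases}
\]
```

For the upper sign the poles sit at $k_0 = \pm \omega(\vec k) + i \epsilon$ in the upper half plane, so the contour picks them up only for $x^0 > 0$; this is the support property checked in the proof.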
(causal propagator is skew-symmetric)
Under reversal of arguments the advanced and retarded causal propagators from prop. 64 are related by
It follows that the causal propagator is skew-symmetric in its arguments:
By prop. 64 we have with (119)
Here in the second step we applied change of integration variables (which introduces no sign because in addition to the integration domain reverses orientation).
(mode expansion of causal propagator for Klein-Gordon equation on Minkowski spacetime)
The causal propagator (84) for the Klein-Gordon equation for mass on Minkowski spacetime (example 25) is given, in generalized function notation, by
where in the second line we used Euler's formula .
In particular this shows that the causal propagator is real, in that it is equal to its complex conjugate
By definition and using the expression from prop. 64 for the advanced and retarded causal propagators we have
For the reality, notice from the last line that
where in the last step we used the change of integration variables (which introduces no sign, since on top of the orientation of the integration domain changes).
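The two elementary facts used in this proof can be spelled out in generic notation. This is only a sketch: the symbols $\theta$ (a real phase), $f$ (a real function that is even under $\vec k \mapsto -\vec k$) and $p$ (the spatial dimension) are placeholders introduced here, not notation taken from the text.

```latex
% Euler's formula, in the form used to pass from exponentials to the sine:
\[
  e^{i\theta} - e^{-i\theta} \;=\; 2 i \sin(\theta)
\]
% Reality: complex conjugation flips the sign of the spatial phase,
% and the substitution \vec k \mapsto -\vec k flips it back.
% Assumption: f is real and f(-\vec k) = f(\vec k).
\[
  \left( \int f(\vec k)\, e^{i \vec k \cdot \vec x}\, d^p \vec k \right)^{\ast}
  \;=\;
  \int f(\vec k)\, e^{- i \vec k \cdot \vec x}\, d^p \vec k
  \;=\;
  \int f(\vec k)\, e^{i \vec k \cdot \vec x}\, d^p \vec k
\]
```

The second equality is the change of variables $\vec k \mapsto -\vec k$, whose Jacobian has absolute value one.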
We consider a couple of equivalent expressions for the causal propagator which are useful for computations:
(causal propagator for Klein-Gordon operator on Minkowski spacetime as a contour integral)
The causal propagator (prop. 42) for the Klein-Gordon equation at mass on Minkowski spacetime (example 25) has the following equivalent expression, as a generalized function, given as a contour integral along a Jordan curve going counter-clockwise around the two poles at :

graphics grabbed from Kocic 16
By Cauchy's integral formula we compute as follows:
The last line is the expression for the causal propagator from prop. 65
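The residue computation behind this step can be sketched for a single generic integrand. Here $a > 0$ plays the role of the dispersion relation evaluated at a fixed spatial wave vector and $t$ plays the role of the time difference; both are placeholder names, and the overall sign and the sign of the exponent in the actual proof depend on the Fourier conventions of the text.

```latex
% C is a Jordan curve running counter-clockwise around both poles z = +a and z = -a.
\[
  \oint_{C} \frac{e^{i z t}}{z^2 - a^2}\, dz
  \;=\;
  2\pi i \left(
    \underbrace{\frac{e^{i a t}}{2a}}_{\text{residue at } z = a}
    \;+\;
    \underbrace{\frac{e^{-i a t}}{-2a}}_{\text{residue at } z = -a}
  \right)
  \;=\;
  2\pi i \cdot \frac{2 i \sin(a t)}{2 a}
  \;=\;
  - \frac{2\pi}{a} \sin(a t)
\]
```

The sine in the mode expansion thus arises as the sum of the two residues.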
(causal propagator as Fourier transform of delta distribution on the Fourier transformed Klein-Gordon operator)
The causal propagator for the Klein-Gordon equation at mass on Minkowski spacetime has the following equivalent expression, as a generalized function:
where the integrand is the product of the sign function of with the delta distribution of the Fourier transform of the Klein-Gordon operator and a plane wave factor.
By decomposing the integral over into its negative and its positive half, and applying the change of integration variables we get
The last line is the expression for the causal propagator from prop. 65.
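The decomposition of the integral into its two halves rests on the standard formula for the delta distribution of a function with simple zeros. A sketch in generic notation, where $z$ stands for the temporal component of the wave vector and $a > 0$ for the value of the dispersion relation (placeholder names, not the text's notation):

```latex
% delta distribution of a quadratic with simple zeros at z = +a and z = -a:
\[
  \delta(z^2 - a^2) \;=\; \frac{1}{2a} \big( \delta(z - a) + \delta(z + a) \big)
\]
% Since \delta(-u) = \delta(u), the overall sign of the quadratic form does not matter.
% Multiplying by the sign function and integrating against a plane wave factor:
\[
  \int \mathrm{sgn}(z)\, \delta(z^2 - a^2)\, e^{i z t}\, dz
  \;=\;
  \frac{1}{2a} \big( e^{i a t} - e^{-i a t} \big)
  \;=\;
  \frac{i}{a} \sin(a t)
\]
% With a step function instead of the sign function only one mass shell branch survives:
\[
  \Theta(\pm z)\, \delta(z^2 - a^2) \;=\; \frac{1}{2a}\, \delta(z \mp a)
\]
```

The last identity is the one relevant for the Hadamard propagator, where only one frequency branch contributes.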
Prop. 67 exhibits the causal propagator of the Klein-Gordon operator on Minkowski spacetime as the difference of a contribution for positive temporal angular frequency (hence positive energy) and a contribution of negative temporal angular frequency.
The positive frequency contribution to the causal propagator is called the Hadamard propagator (def. 107 below), also known as the vacuum state 2-point function of the free real scalar field on Minkowski spacetime. Notice that the temporal component of the wave vector is proportional to the negative angular frequency
(see at plane wave), therefore the appearance of the step function in (129) below:
(Hadamard propagator or vacuum state 2-point function for Klein-Gordon operator on Minkowski spacetime)
The Hadamard propagator for the Klein-Gordon operator at mass on Minkowski spacetime (example 25) is the tempered distribution in two variables which as a generalized function is given by the expression
Here in the first line we have in the integrand the delta distribution of the Fourier transform of the Klein-Gordon operator times a plane wave and times the step function of the temporal component of the wave vector. In the second line we used the change of integration variables , then the definition of the delta distribution and the fact that is by definition the non-negative solution to the Klein-Gordon dispersion relation.
(e.g. Khavkine-Moretti 14, equation (38) and section 3.4)
(contour integral representation of the Hadamard propagator for the Klein-Gordon operator on Minkowski spacetime)
The Hadamard propagator from def. 107 is equivalently given by the contour integral
where the Jordan curve runs counter-clockwise, enclosing the point , but not enclosing the point .

graphics grabbed from Kocic 16
We compute as follows:
The last step is application of Cauchy's integral formula, which says that the contour integral picks up the residue of the pole of the integrand at . The last line is , by definition 107.
(skew-symmetric part of Hadamard propagator is the causal propagator)
The Hadamard propagator for the Klein-Gordon equation on Minkowski spacetime (def. 107) is of the form
where
is the causal propagator (prop. 64), which is real (128) and skew-symmetric (prop. 2)
is real and symmetric
By applying Euler's formula to (129) we obtain
On the left this identifies the causal propagator by (127), prop. 65.
The second summand changes, both under complex conjugation as well as under , via change of integration variables (because the cosine is an even function). This does not change the integral, and hence is symmetric.
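The splitting used in this proof is just the decomposition of a plane wave factor into its odd and even parts. A sketch, where $\theta$ stands for the full phase of the mode and $a > 0$ for the value of the dispersion relation (placeholder names); which sign of the exponent appears depends on the conventions of the text.

```latex
\[
  \frac{1}{2a}\, e^{\mp i \theta}
  \;=\;
  \underbrace{ \mp \frac{i}{2a} \sin(\theta) }_{\text{odd in } \theta}
  \;+\;
  \underbrace{ \frac{1}{2a} \cos(\theta) }_{\text{even in } \theta}
\]
```

Exchanging the two arguments of the propagator sends $\theta \mapsto -\theta$, so the sine term yields the skew-symmetric part and the cosine term yields the symmetric part.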
We have seen that the positive frequency component of the causal propagator for the Klein-Gordon equation on Minkowski spacetime (prop. 64) is the Hadamard propagator (def. 107) given, according to prop. 69, by (131)
There is an evident variant of this combination, which will be of interest:
(Feynman propagator for Klein-Gordon equation on Minkowski spacetime)
The Feynman propagator for the Klein-Gordon equation on Minkowski spacetime (example 25) is the linear combination
where the first term is proportional to the sum of the advanced and retarded propagators (prop. 64) and the second is the symmetric part of the Hadamard propagator according to prop. 69.
Similarly the anti-Feynman propagator is
(mode expansion for Feynman propagator of Klein-Gordon equation on Minkowski spacetime)
The Feynman propagator (def. 108) for the Klein-Gordon equation on Minkowski spacetime is given by the following equivalent expressions
Similarly the anti-Feynman propagator is equivalently given by
By the mode expansion of from (119) and the mode expansion of from (132) we have
where in the second line we used Euler's formula. The last line follows by comparison with (129) and using that the integral over is invariant under .
The computation for is the same, only now with a minus sign in front of the cosine:
As before for the causal propagator, there are equivalent reformulations of the Feynman propagator which are useful for computations:
(Feynman propagator as a Cauchy principal value)
The Feynman propagator and anti-Feynman propagator (def. 108) for the Klein-Gordon equation on Minkowski spacetime is equivalently given by the following expressions, respectively:
where we have a limit of distributions as for the Cauchy principal value (this prop).
We compute as follows:
Here
In the first step we introduced the complex square root . For this to be compatible with the choice of non-negative square root for in (118) we need to choose that complex square root whose complex phase is one half that of (instead of that plus π). This means that is in the lower half plane and is in the upper half plane.
In the third step we observe that
for the integrand decays for positive imaginary part and hence the integration over may be deformed to a contour which encircles the pole in the upper half plane;
for the integrand decays for negative imaginary part and hence the integration over may be deformed to a contour which encircles the pole in the lower half plane
and then apply Cauchy's integral formula which picks out times the residue at these poles.

Notice that when completing to a contour in the lower half plane we pick up a minus sign from the fact that now the contour runs clockwise.
In the fourth step we used prop. 70.
singular support and wave front sets
We now discuss the singular support (def. 99) and the wave front sets (def. 101) of the various propagators for the Klein-Gordon equation on Minkowski spacetime.
(singular support of the causal propagator of the Klein-Gordon equation on Minkowski spacetime is the light cone)
The singular support of the causal propagator for the Klein-Gordon equation on Minkowski spacetime, regarded via translation invariance as a generalized function in a single variable (120) is the light cone of the origin:
By prop. 67 the causal propagator is equivalently the Fourier transform of distributions of the delta distribution of the mass shell times the sign function of the angular frequency; and by basic properties of the Fourier transform this is the convolution of distributions of the separate Fourier transforms:
By prop. 63, the singular support of the first convolution factor is the light cone.
The second factor is
(by example 97 and example 90) and hence the wave front set (def. 101) of the second factor is
(by example 85 and example 88).
With this the statement follows, via a partition of unity, from prop. 57.
For illustration we now make this general argument more explicit in the special case of spacetime dimension
by computing an explicit form for the causal propagator in terms of the delta distribution, the Heaviside distribution and smooth Bessel functions.
We follow (Scharf 95 (2.3.18)).
Consider the formula for the causal propagator in terms of the mode expansion (127). Since the integrand here depends on the wave vector only via its norm and the angle it makes with the given spacetime vector via
we may express the integration in terms of polar coordinates as follows:
In the special case of spacetime dimension this becomes
Here in the second-to-last step we renamed and doubled the integration domain for convenience, and in the last step we used the trigonometric identity .
In order to further evaluate this, we parameterize the remaining components of the wave vector by the dual rapidity , via
as
which makes use of the fact that is non-negative, by construction. This change of integration variables makes the integrals under the braces above become
Next we similarly parameterize the vector by its rapidity . That parameterization depends on whether is spacelike or not, and if not, whether it is future or past directed.
First, if is spacelike in that then we may parameterize as
which yields
where in the last line we observe that the integrand is a skew-symmetric function of .
Second, if is timelike with then we may parameterize as
which yields
Here in the last line we identified the integral representation of the Bessel function of order 0 (see here). The important point here is that this is a smooth function.
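The change of variables to the rapidity and the integral representation identified in the last step can be sketched as follows. The symbols $M$ (the mass parameter in units of inverse length), $\kappa$ (the norm of the spatial wave vector), $z$ (the dual rapidity) and $E$ (the dispersion relation divided by the speed of light) are placeholder names chosen here; the precise normalization used in the text is not recoverable from this section.

```latex
% Rapidity parameterization of the mass shell, with z >= 0:
\[
  \kappa = M \sinh(z)\,, \qquad E = \sqrt{\kappa^2 + M^2} = M \cosh(z)
\]
% The invariant measure becomes trivial in these coordinates:
\[
  d\kappa = M \cosh(z)\, dz = E\, dz
  \qquad\Longrightarrow\qquad
  \frac{d\kappa}{E} = dz
\]
% Standard integral representation of the Bessel function of order 0, for x > 0:
\[
  J_0(x) \;=\; \frac{2}{\pi} \int_0^\infty \sin\big( x \cosh(t) \big)\, dt
\]
```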
Similarly, if is timelike with then the same argument yields
In conclusion, the general form of is
Therefore we end up with
(singular support of the Hadamard propagator of the Klein-Gordon equation on Minkowski spacetime is the light cone)
The singular support of the Hadamard propagator (def. 107) for the Klein-Gordon equation on Minkowski spacetime, regarded via translation invariance as a distribution in a single variable, is the light cone of the origin:
By prop. 67 the causal propagator is equivalently the Fourier transform of distributions of the delta distribution of the mass shell times the sign function of the angular frequency; and by basic properties of the Fourier transform this is the convolution of distributions of the separate Fourier transforms:
By prop. 63, the singular support of the first convolution factor is the light cone.
The second factor is
(by example 97 and example 90) and hence the wave front set (def. 101) of the second factor is
(by example 85 and example 88).
With this the statement follows, via a partition of unity, from prop. 57.
For illustration, we now make this general statement fully explicit in the special case of spacetime dimension
by computing an explicit form for the Hadamard propagator in terms of the delta distribution, the Heaviside distribution and smooth Bessel functions.
We follow (Scharf 95 (2.3.36)).
By (132) we have
The first summand is proportional to the causal propagator, which we computed as (136) in prop. 72 to be
The second term is computed in a directly analogous fashion: The integrals from (134) are now
Parameterizing by rapidity, as in the proof of prop. 72, one finds that for timelike this is
while for spacelike it is
where we identified the integral representations of the Neumann function (see here) and of the modified Bessel function (see here).
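For orientation, the standard integral representations of these two functions at order 0 are recorded here. This is a sketch in generic notation with $x > 0$; how exactly the arguments and prefactors match the integrals of the proof depends on normalizations not recoverable from this section.

```latex
% Neumann function (Bessel function of the second kind) of order 0, for x > 0:
\[
  N_0(x) \;=\; - \frac{2}{\pi} \int_0^\infty \cos\big( x \cosh(t) \big)\, dt
\]
% Modified Bessel function of the second kind of order 0, for x > 0:
\[
  K_0(x) \;=\; \int_0^\infty \cos\big( x \sinh(t) \big)\, dt
\]
```

The hyperbolic cosine appears in the timelike case and the hyperbolic sine in the spacelike case, in line with the two rapidity parameterizations.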
As for the Bessel function in (135) the key point is that these are smooth functions. Hence we conclude that
This expression has singularities on the light cone due to the step functions. In fact the expression being differentiated is continuous at the light cone (Scharf 95 (2.3.34)), so that the singularity on the light cone is not a delta distribution singularity from the derivative of the step functions. Accordingly it does not cancel the singularity of as above, and hence the singular support of is still the whole light cone.
(singular support of Feynman propagator for Klein-Gordon equation on Minkowski spacetime)
The singular support of the Feynman propagator and of the anti-Feynman propagator (def. 108) for the Klein-Gordon equation on Minkowski spacetime, regarded via translation invariance as a distribution in a single variable, is the light cone of the origin:
(e.g. DeWitt 03 (27.85))
By prop. 71 the Feynman propagator is equivalently the Cauchy principal value of the inverse of the Fourier transformed Klein-Gordon operator:
With this, the statement follows immediately from prop. 62.
(wave front sets of propagators of Klein-Gordon equation on Minkowski spacetime)
The wave front set of the various propagators for the Klein-Gordon equation on Minkowski spacetime, regarded, via translation invariance, as distributions in a single variable, are as follows:
First regarding the causal propagator:
By prop. 72 the singular support of is the light cone.
Since the causal propagator is a solution to the homogeneous Klein-Gordon equation, the propagation of singularities theorem (prop. 59) says that also all wave vectors in the wave front set are lightlike. Hence it just remains to show that all non-vanishing lightlike wave vectors based on the lightcone in spacetime indeed do appear in the wave front set.
To that end, let be a bump function whose compact support includes the origin.
For a point on the light cone, we need to determine the decay property of the Fourier transform of . This is the convolution of distributions of with . By prop. 67 we have
This means that the convolution product is the smearing of the mass shell by .
Since the mass shell asymptotes to the light cone, and since for on the light cone (given that is on the light cone), this implies the claim.
Now for the Hadamard propagator:
By def. 107 its Fourier transform is of the form
Moreover, its singular support is also the light cone (prop. 73).
Therefore the same argument as before shows that the wave front set consists of wave vectors on the light cone, but now, due to the step function factor, they must satisfy .
Finally regarding the Feynman propagator:
By prop. 70 the Feynman propagator coincides with the positive frequency Hadamard propagator for and with the “negative frequency Hadamard propagator” for . Therefore the form of now follows directly with that of above.
propagators for the Dirac equation on Minkowski spacetime
We now discuss how the propagators for the free Dirac field on Minkowski spacetime (example 64) follow directly from those for the scalar field discussed above.
(advanced and retarded propagator for Dirac equation on Minkowski spacetime)
Consider the Dirac operator on Minkowski spacetime, which in Feynman slash notation reads
Its advanced and retarded propagators (def. 78) are the derivatives of distributions of the advanced and retarded propagators for the Klein-Gordon equation (prop. 64) by :
Hence the same is true for the causal propagator:
Applying a differential operator does not enlarge the support of a smooth function, hence also not the support of a distribution. Therefore the uniqueness of the advanced and retarded propagators (prop. 41), together with the translation-invariance and the formal anti-self-adjointness of the Dirac operator (as for the Klein-Gordon operator (120)), implies that it is sufficient to check that applying the Dirac operator to the claimed expressions yields the delta distribution. This follows since the Dirac operator squares to the Klein-Gordon operator:
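The algebra behind this squaring can be sketched without fixing the Clifford algebra convention. The sign $s = \pm 1$ and the mass parameter $M$ are placeholders introduced here; which value of $s$ and which ordering of the two first-order factors the text uses is an assumption left open.

```latex
% Assumption: Clifford relation with unspecified sign s = +1 or s = -1.
\[
  \gamma^\mu \gamma^\nu + \gamma^\nu \gamma^\mu = 2 s\, \eta^{\mu\nu}
  \qquad\Longrightarrow\qquad
  (\gamma^\mu \partial_\mu)^2
  = \tfrac{1}{2} \{\gamma^\mu, \gamma^\nu\}\, \partial_\mu \partial_\nu
  = s\, \eta^{\mu\nu} \partial_\mu \partial_\nu
\]
% The cross terms cancel because the mass term commutes with the derivative term:
\[
  \big( - i \gamma^\mu \partial_\mu + M \big) \big( + i \gamma^\nu \partial_\nu + M \big)
  \;=\;
  (\gamma^\mu \partial_\mu)^2 + M^2
  \;=\;
  s\, \eta^{\mu\nu} \partial_\mu \partial_\nu + M^2
\]
```

In either convention the product is, up to an overall sign, a Klein-Gordon type operator, so applying one first-order factor to a Klein-Gordon Green function produces, up to that sign, a Green function for the other factor.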
Similarly we obtain the other propagators for the Dirac field from those of the real scalar field:
(Hadamard propagator for Dirac operator on Minkowski spacetime)
The Hadamard propagator for the Dirac operator on Minkowski spacetime is the positive frequency part of the causal propagator (prop. 76), hence the derivative of distributions of the Hadamard propagator for the Klein-Gordon field (def. 107) by the Dirac operator:
Here we used the expression (129) for the Hadamard propagator of the Klein-Gordon equation.
(Feynman propagator for Dirac operator on Minkowski spacetime)
The Feynman propagator for the Dirac operator on Minkowski spacetime is the linear combination
of the Hadamard propagator (def. 109) and the retarded propagator (prop. 76). By prop. 71 this means that it is the derivative of distributions of the Feynman propagator of the Klein-Gordon equation (def. 108) by the Dirac operator
This concludes our discussion of propagators induced from the covariant phase space of Green hyperbolic free Lagrangian field theory. These propagators will be key for quantization via causal perturbation theory. But not all free field theories have a covariant phase space of Green hyperbolic equations of motion; for instance the electromagnetic field, a priori, does not. Therefore before turning to quantization in the next chapter we first discuss how gauge symmetries obstruct the existence of Green hyperbolic equations of motion.
The existence of the covariant phase space (prop. 46) of a Lagrangian field theory requires the existence of Cauchy surfaces (def. 87) for its Euler-Lagrange equations of motion. In the case of free field theory (def. 62) this means that the equations of motion are Green hyperbolic (def. 79).
We have seen that this is the case for instance for the scalar field (example 63) and the Dirac field (example 64), but it is not the case generally, for instance it fails for the electromagnetic field (example 46), the Yang-Mills field (example 41) and the B-field (example 42). An obstruction to the existence of the covariant phase space turns out to be (prop. 77 below) the presence of infinitesimal symmetries of the Lagrangian (def. 66) that have compact spacetime support (def. 81).
A class of examples of such are those infinitesimal symmetries of the Lagrangian which occur linearly parameterized by arbitrary sections (and their derivatives) of some vector bundle on spacetime, because then for every choice of section of compact support the corresponding symmetry has compact spacetime support. These parameterized infinitesimal symmetries of the Lagrangian are called infinitesimal gauge symmetries, and their parameters we call the gauge parameters (def. 111 below).
Typically all compactly supported infinitesimal symmetries of the Lagrangian arise from parameterized symmetries this way; this is notably the case for the Lagrangian density of the electromagnetic field (example 92) and more generally of the Yang-Mills field.
Therefore the presence of infinitesimal symmetries of the Lagrangian with compact spacetime support is a defect of the theory which however implies its own solution, by indicating which relations ought to be promoted to “gauge” equivalences.
This obstruction is neatly captured by the cochain cohomology of the local BV-complex (def. 85) of the Lagrangian field theory (prop. 83 below). This may be understood as the algebra of functions on an extension of the jet bundle from a (locally pro-finite dimensional, prop. 112) smooth manifold to a differential graded manifold. This appearance of homotopy theory in the guise of homological algebra in Lagrangian field theory paves the way to understanding the cause of the obstruction: It disappears when the field bundle (or more generally its jet bundle) is promoted to its infinitesimal homotopy quotient by the action of these compactly supported symmetries (the “action Lie algebroid”, def. 115 below).
Passing to this homotopy quotient means to hard-wire into the geometry of the types of field their equivalence under these symmetries: in physics this is called gauge equivalence. The result is called the “reduced phase space”, which we turn to further below.
We now discuss these topics
As an immediate corollary of prop. 34 we have the following important observation:
(spacetime-compactly supported and on-shell non-trivial infinitesimal symmetries of the Lagrangian obstruct the covariant phase space)
Let be a Lagrangian field theory over a Lorentzian spacetime.
If there exists a single infinitesimal symmetry of the Lagrangian (def. 66) such that
it has compact spacetime support (def. 81)
it does not vanish on-shell (49) (so not a trivial one, example 91)
then there does not exist any Cauchy surface (def. 87) for the Euler-Lagrange equations of motion (def. 61) outside the spacetime support of .
By prop. 34 the flow along this symmetry preserves the on-shell space of field histories. Now the assumption that it does not vanish on-shell implies that this flow is non-trivial, hence that it does continuously change the field histories over some points of spacetime, while the assumption that it has compact spacetime support means that these changes are confined to a compact subset of spacetime.
This means that there is a continuum of solutions to the equations of motion whose restrictions to the infinitesimal neighbourhood of any codimension-1 surface outside of this compact support coincide. Therefore the restriction map is not an isomorphism, and hence such a surface is not a Cauchy surface for the equations of motion.
Notice that there always exist spacetime-compactly supported infinitesimal symmetries that however do vanish on-shell:
(trivial implicit infinitesimal gauge symmetries)
Let be a Lagrangian field theory (def. 60) over Minkowski spacetime (def. 23), so that the Lagrangian density is canonically of the form
with Lagrangian function a smooth function of the jet bundle (characterized by prop. 19).
Then every evolutionary vector field (def. 64) whose coefficients are proportional to the Euler-Lagrange derivative (47) of the Lagrangian function
by smooth coefficient functions
such that
each has compact spacetime support (def. 81)
is skew-symmetric in its indices:
is an implicit infinitesimal gauge symmetry.
This is so for a “trivial reason”, namely due to that skew-symmetry:
Here the first steps are just recalling those in the proof of Noether's theorem I (prop. 30) while the last step follows with the skew-symmetry of .
Notice that this means that
the Noether current (73) vanishes: ;
Therefore these implicit infinitesimal gauge symmetries are called the trivial infinitesimal gauge transformations.
(e.g. Henneaux 90, section 2.5)
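The mechanism can be seen already in the simplest case, where the coefficient functions carry no derivative indices. The following is only a sketch of that special case; the names $\kappa^{a b}$ for the coefficient functions and $\phi^a$ for the field coordinates are placeholders chosen here.

```latex
% Assumption: order-zero case, with skew-symmetric coefficients \kappa^{ab} = - \kappa^{ba}.
\[
  v \;=\; \kappa^{a b}\, \frac{\delta_{EL} L}{\delta \phi^b}\, \partial_{\phi^a}
\]
% Contracting with the Euler-Lagrange form pairs a skew-symmetric tensor
% with a symmetric product, which vanishes identically:
\[
  \frac{\delta_{EL} L}{\delta \phi^a}\, \kappa^{a b}\, \frac{\delta_{EL} L}{\delta \phi^b}
  \;=\; 0
\]
```

Hence such a vector field is a symmetry for every Lagrangian, and since its coefficients are proportional to the Euler-Lagrange derivative it vanishes on-shell.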
infinitesimal gauge symmetries
Prop. 77 says that the problem is to identify the presence of spacetime-compactly supported infinitesimal symmetries that are on-shell non-trivial.
One way they may be identified is if infinitesimal symmetries appear in linearly parameterized collections, where the parameter itself is an arbitrary spacetime-dependent section of some fiber bundle (hence is itself like a field history), because then choosing the parameter to have compact support yields an infinitesimal gauge symmetry. In this case we speak of a gauge parameter (def. 111 below). It turns out that in most examples of Lagrangian field theories of interest, the infinitesimal gauge symmetries all come from gauge parameters this way, and often “gauge symmetry” is understood by default to refer to this case. Therefore we now consider this case in detail.
(infinitesimal gauge symmetries)
Let be a Lagrangian field theory (def. 60).
Then a collection of infinitesimal gauge symmetries of is
a vector bundle over spacetime of positive rank, to be called a gauge parameter bundle;
a bundle morphism (def. 4) from the jet bundle of the fiber product with the field bundle (def. 54) to the vertical tangent bundle of (def. 6):
such that
is linear in the first argument (in the gauge parameter);
is an evolutionary vector field on (def. 64);
is an infinitesimal symmetry of the Lagrangian (def. 66) in the second argument.
We may express this equivalently in components in the case that the field bundle is a trivial vector bundle with field fiber coordinates (example 9) and also happens to be a trivial vector bundle
where is a vector space with coordinate functions .
Then may be expanded in the form
where the components
are smooth functions on the jet bundle of , locally of finite order (prop. 19), and such that the Lie derivative of the Lagrangian density along is a total spacetime derivative, which by Noether's theorem I (prop. 30) means in components that
The point is that infinitesimal gauge symmetries in particular yield spacetime-compactly supported infinitesimal gauge symmetries:
(infinitesimal gauge symmetries yield spacetime-compactly supported infinitesimal symmetries of the Lagrangian)
Let be a Lagrangian field theory (def. 60) and a bundle of gauge parameters for it (def. 111) with gauge parametrization
Then for every smooth section of the gauge parameter bundle (def. 5) there is an induced infinitesimal symmetry of the Lagrangian (def. 66) given by the composition of with the jet prolongation of (def. 55)
In the components (137) this means that
where now are the actual spacetime partial derivatives of the gauge parameter section.
In particular, since is assumed to be a vector bundle, there always exist gauge parameter sections that have compact support (bump functions). For such compactly supported the infinitesimal symmetry is spacetime-compactly supported as in prop. 77.
The following is a way to identify infinitesimal gauge symmetries:
(Noether's theorem II – Noether identities)
Let be a Lagrangian field theory (def. 60) and let be a vector bundle.
Then a bundle morphism of the form
is a collection of infinitesimal gauge symmetries (def. 111) with local components (137)
precisely if the Euler-Lagrange form (prop. 22) satisfies the following condition:
These relations are called the Noether identities of the Euler-Lagrange equations of motion (def 61).
By Noether's theorem I, is an infinitesimal symmetry of the Lagrangian precisely if the contraction (def. 13) of with the Euler-Lagrange form (prop. 22) is horizontally exact:
From (137) this means that
where in the last step we used jet-level integration by parts to move the total spacetime derivatives off of , thereby picking up some horizontally exact correction term, as shown.
This means that the term over the brace is horizontally exact:
But now the term on the left is independent of the jet coordinates of positive order , while the horizontal derivative increases the dependency on the jet order by one. Therefore the term on the left is horizontally exact precisely if it vanishes, which is the case precisely if the coefficients of vanish, which is the statement of the Noether identities.
Alternatively we may reach this conclusion from (138) by applying to both sides of (138) the Euler-Lagrange derivative (47) with respect to . On the left this yields again the coefficients of , while by the argument from example 50 it makes the right hand side vanish.
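The integration by parts in this proof can be sketched in the special case where the gauge parameter enters with at most first derivatives. The component names $R^a_\alpha$ and $R^{a \mu}_\alpha$, the parameter coordinates $\epsilon^\alpha$ and the field coordinates $\phi^a$ are assumptions made here for illustration; the general case has further terms with higher derivatives.

```latex
% Assumption: the symmetry depends on \epsilon and its first derivatives only.
\[
  \frac{\delta_{EL} L}{\delta \phi^a}
  \left( R^a_\alpha\, \epsilon^\alpha + R^{a \mu}_\alpha\, \epsilon^\alpha_{,\mu} \right)
  \;=\;
  \epsilon^\alpha
  \underbrace{
    \left(
      R^a_\alpha\, \frac{\delta_{EL} L}{\delta \phi^a}
      - \frac{d}{d x^\mu} \left( R^{a \mu}_\alpha\, \frac{\delta_{EL} L}{\delta \phi^a} \right)
    \right)
  }_{\text{must vanish}}
  \;+\;
  \frac{d}{d x^\mu}
  \left( \epsilon^\alpha\, R^{a \mu}_\alpha\, \frac{\delta_{EL} L}{\delta \phi^a} \right)
\]
% The resulting Noether identity in this first-order case:
\[
  R^a_\alpha\, \frac{\delta_{EL} L}{\delta \phi^a}
  - \frac{d}{d x^\mu} \left( R^{a \mu}_\alpha\, \frac{\delta_{EL} L}{\delta \phi^a} \right)
  \;=\; 0
\]
```

The first equality is just the product rule for the total spacetime derivative, read backwards.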
(infinitesimal gauge symmetry of electromagnetic field)
Consider the Lagrangian field theory of free electromagnetism on Minkowski spacetime from example 40. With field coordinates denoted the Lagrangian density is
where is the universal Faraday tensor from example 36.
Let be the trivial line bundle, regarded as a gauge parameter bundle (def. 23) with coordinate functions .
Then a gauge parametrized evolutionary vector field (137) is given by
with prolongation (prop. 28)
This is because already the universal Faraday tensor is invariant under this flow:
because partial derivatives commute with each other: (29).
Equivalently, the Euler-Lagrange form
of the theory (example 46), corresponding to the vacuum Maxwell equations (example 26), satisfies the following Noether identity (prop. 78):
again due to the fact that partial derivatives commute with each other.
This is the archetypical infinitesimal gauge symmetry that gives gauge theory its name.
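In ordinary field notation, rather than in jet coordinates, the two statements of this example read as follows. This is a sketch: $a_\mu$ denotes the field, $\epsilon$ the gauge parameter, and the field strength is taken up to its normalization convention, which is an assumption left open here.

```latex
% Gauge transformation and invariance of the field strength:
\[
  \delta_\epsilon a_\mu = \partial_\mu \epsilon\,,
  \qquad
  F_{\mu\nu} \propto \partial_\mu a_\nu - \partial_\nu a_\mu
  \qquad\Longrightarrow\qquad
  \delta_\epsilon F_{\mu\nu}
  \propto \partial_\mu \partial_\nu \epsilon - \partial_\nu \partial_\mu \epsilon
  = 0
\]
% Noether identity: a symmetric pair of derivatives contracted
% with the skew-symmetric field strength vanishes.
\[
  \partial_\mu \partial_\nu F^{\mu\nu} \;=\; 0
\]
```

The Noether identity says that the divergence of the left-hand side of the vacuum Maxwell equations vanishes identically, on-shell or not.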
More generally:
(infinitesimal gauge symmetry of Yang-Mills theory)
For a semisimple Lie algebra, consider the Lagrangian field theory of Yang-Mills theory on Minkowski spacetime from example 41, with Lagrangian density
given by the universal field strength
Let be the trivial vector bundle with fiber , regarded as a gauge parameter bundle (def. 23) with coordinate functions .
Then a gauge parametrized evolutionary vector field (137) is given by
with prolongation (prop. 28)
We compute the derivative of the Lagrangian function along this vector field:
Here in the third step we used that (29), so that its contraction with the skew-symmetric vanishes, and in the last step we used that for a semisimple Lie algebra is totally skew symmetric.
So the Lagrangian density of Yang-Mills theory is strictly invariant under these infinitesimal gauge symmetries.
Lie algebra action and Lie algebroids
Making the implicit infinitesimal gauge symmetries explicit means to make explicit how they act on the fields. To this end consider the general concept of an action of a Lie algebra by infinitesimal diffeomorphisms:
(action of Lie algebra by infinitesimal diffeomorphism)
Let be a smooth manifold or more generally a locally pro-manifold (prop. 19), and let be a Lie algebra.
An action of on by infinitesimal diffeomorphisms, is a homomorphism of Lie algebras
to the smooth vector fields on .
Equivalently – to bring out the relation to the gauge parameterized implicit infinitesimal gauge transformations in def. 23 – this is a -parameterized section
of the tangent bundle, such that for all pairs of points in we have
(with the Lie bracket of vector fields on the left).
(irreducible closed gauge parameters)
Let be a Lagrangian field theory (def. 60). Then a collection
of infinitesimal gauge symmetries (def. 111) is called irreducibly closed if it is closed under the Lie bracket of evolutionary vector fields (prop. 29) in that there is a unique morphism
such that
where on the left we have the Lie bracket of evolutionary vector fields from prop. 29.
(action of irreducible closed gauge parameterized implicit infinitesimal gauge symmetries on fields)
Let be a Lagrangian field theory (def. 60), and let be a bundle of irreducible closed gauge parameters for the theory (def. 23) with bundle morphism
exhibiting the corresponding parameterized implicit infinitesimal gauge symmetries.
By passing from these evolutionary vector fields (def. 64) to their prolongations , being actual vector fields on the jet bundle (prop. 28), we obtain a bundle morphism of the form
In the case that is a trivial vector bundle, with fiber , then so is its jet bundle
and so in this case the above becomes of the form
By def. 23 and def. 112 this now exhibits an action
of a Lie algebra on the jet bundle of the field bundle by infinitesimal diffeomorphisms.
We have seen that the presence of non-trivial implicit infinitesimal gauge transformations in a Lagrangian field theory obstructs the existence of the covariant phase space of the theory (prop. 77). But these implicit infinitesimal gauge symmetries become explicit by hard-wiring into the very geometry of the types of fields their equivalence under these symmetries: In physics this is called gauge equivalence.
Mathematically this means to pass to the infinitesimal homotopy quotient of the action of the gauge symmetries on the shell, represented by the action Lie algebroid (def. 115 below). This is called the local reduced phase space of the theory. Such “higher structures” exist in the unification of differential geometry with homotopy theory called higher differential geometry. The (“Chevalley-Eilenberg”-)algebra of functions on this “field bundle with infinitesimal gauge symmetries made explicit” is called the BRST complex. In this cochain complex the formerly implicit infinitesimal gauge symmetries appear explicitly in the guise of field variables of positive (i.e. “higher”) degree in a differential graded-commutative algebra. These are called ghost fields.
Let be an infinitesimally thickened point and write for its algebra of functions. Then a connected Lie ∞-algebroid over of finite type consists of
a sequence of free modules of finite rank over , hence a graded module in degrees ;
a differential that makes the graded-commutative algebra into a cochain differential graded-commutative algebra (hence with of degree +1) over (not necessarily over ), to be called the Chevalley-Eilenberg algebra of :
If we allow to also have terms in non-positive degree, then we speak of a derived Lie algebroid. If is only concentrated in negative degrees, we also speak of a derived manifold.
With canonically itself regarded as a dgc-algebra, there is a canonical dg-algebra homomorphism
which is the identity on and zero on .
(Lie algebroids as differential graded manifolds)
Definition 114 of derived Lie algebroids is an encoding in higher algebra (homological algebra, in this case) of a situation that is usefully thought of in terms of higher differential geometry.
To see this, recall the magic algebraic properties of ordinary differential geometry (prop. 1)
embedding of smooth manifolds into formal duals of R-algebras;
embedding of smooth vector bundles into formal duals of modules
Together these imply that we may think of the graded algebra underlying a Chevalley-Eilenberg algebra as being the algebra of functions on a graded manifold
which is infinitesimal in non-vanishing degree.
The “higher” in higher differential geometry refers to the degrees higher than zero. See at Higher Structures for exposition. Specifically if has components in negative degrees, these are also called derived manifolds.
(basic examples of Lie algebroids)
Two basic examples of Lie algebroids are:
For any smooth manifold, then setting and makes it a Lie algebroid. We will still just write for the manifold trivially regarded as a Lie algebroid this way.
For a finite dimensional Lie algebra we obtain a Lie algebroid denoted or by taking the base manifold to be the point, taking to be concentrated in degree 1 on , and taking the differential to be given by the linear dual of the Lie bracket
If is a linear basis for and a corresponding dual basis for , then this is given by
where on the right we have the structure constants of the Lie bracket in this basis:
The resulting dgc-algebra
is the standard Chevalley-Eilenberg algebra from basic Lie theory, whence the name of the general concept.
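As a concrete check of this construction, the following sketch spells out the differential in a basis and verifies that it squares to zero, first in general and then for the Lie algebra of rotations in three dimensions. The symbols $t_\alpha$, $c^\alpha$, $\gamma^\alpha{}_{\beta\gamma}$ and in particular the overall sign and the factor $\tfrac{1}{2}$ are assumptions made for illustration; some references use the opposite overall sign.

```latex
% Assumed conventions (illustrative):
%   [t_\beta, t_\gamma] = \gamma^\alpha{}_{\beta\gamma}\, t_\alpha ,
%   c^\alpha dual to t_\alpha and of degree 1, so c^\beta c^\gamma = - c^\gamma c^\beta .
\begin{align*}
  d_{CE}\, c^\alpha
    &= \tfrac{1}{2}\, \gamma^\alpha{}_{\beta\gamma}\, c^\beta c^\gamma ,
  \\
  d_{CE}\, d_{CE}\, c^\alpha
    &= \gamma^\alpha{}_{\beta\gamma}\, (d_{CE} c^\beta)\, c^\gamma
     = \tfrac{1}{2}\, \gamma^\alpha{}_{\beta\gamma}\, \gamma^\beta{}_{\delta\epsilon}\,
       c^\delta c^\epsilon c^\gamma .
\end{align*}
% The product c^\delta c^\epsilon c^\gamma is totally skew, so the last line vanishes
% for all \alpha exactly if \gamma^\alpha{}_{\beta\gamma}\gamma^\beta{}_{\delta\epsilon}
% summed over cyclic permutations of (\delta,\epsilon,\gamma) vanishes: the Jacobi identity.

% Worked instance: so(3) with [t_1,t_2] = t_3, [t_2,t_3] = t_1, [t_3,t_1] = t_2.
\begin{align*}
  d_{CE}\, c^1 &= c^2 c^3 , \qquad
  d_{CE}\, c^2 = c^3 c^1 , \qquad
  d_{CE}\, c^3 = c^1 c^2 ,
  \\
  d_{CE}\, d_{CE}\, c^1
    &= (d_{CE} c^2)\, c^3 - c^2\, (d_{CE} c^3)
     = c^3 c^1 c^3 - c^2 c^1 c^2
     = 0 .
\end{align*}
```

The two remaining generators are handled by cyclic permutation of the indices.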
The two basic examples 95 are unified by the concept of action Lie algebroid (def. 115 below), which is the one of central relevance for the discussion of gauge theory: the local BRST complex (def. 97 below).
Given an infinitesimal action of a Lie algebra on a manifold (def. 112) the action Lie algebroid is the Lie algebroid (def. 114) whose underlying space is ; whose -module is concentrated in degree 1 on the free module and whose CE-differential is given
on functions by the Lie algebra action
on dual Lie algebra elements by the linear dual of the Lie bracket
In terms of coordinates this means the following. Assume that is a Cartesian space with coordinates and let be a linear basis for with dual basis for . Then the Lie action has components
where on the right we have the structure constants of the Lie algebra in this basis:
That the differential thus defined indeed squares to 0 is
in degree 0 the action property:
in degree 1 the Jacobi identity.
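The two conditions can be made explicit by a short computation. The following is a sketch under assumed notation: coordinates $\phi^a$ on the Cartesian space, action components $R^a_\alpha$, ghost generators $c^\alpha$, and the sign convention $d_{CE} c^\alpha = \tfrac{1}{2}\gamma^\alpha{}_{\beta\gamma} c^\beta c^\gamma$. Sign conventions differ between references, and the form of the action property below depends on this choice.

```latex
% Assumed notation (illustrative): R_\alpha = R^a_\alpha \partial_a are the vector fields
% of the action, [R_\beta, R_\gamma]^a = R^b_\beta \partial_b R^a_\gamma - R^b_\gamma \partial_b R^a_\beta .
\begin{align*}
  d_{CE}\, \phi^a &= c^\alpha R^a_\alpha ,
  \qquad
  d_{CE}\, c^\alpha = \tfrac{1}{2}\, \gamma^\alpha{}_{\beta\gamma}\, c^\beta c^\gamma ,
  \\
  d_{CE}\, d_{CE}\, \phi^a
    &= (d_{CE} c^\alpha)\, R^a_\alpha - c^\alpha\, d_{CE}(R^a_\alpha)
  \\
    &= \tfrac{1}{2}\, \gamma^\alpha{}_{\beta\gamma}\, c^\beta c^\gamma R^a_\alpha
       - c^\alpha c^\beta\, R^b_\beta\, \partial_b R^a_\alpha
  \\
    &= \tfrac{1}{2}\, c^\beta c^\gamma
       \left( \gamma^\alpha{}_{\beta\gamma}\, R^a_\alpha + [R_\beta, R_\gamma]^a \right) .
\end{align*}
% Hence d_{CE}^2 \phi^a = 0 for all a exactly if
%   [R_\beta, R_\gamma] = - \gamma^\alpha{}_{\beta\gamma} R_\alpha ,
% which is the action property in this sign convention (with the opposite sign
% on d_{CE} c^\alpha the minus sign disappears).
% In degree 1, d_{CE}^2 c^\alpha = 0 is the Jacobi identity for \gamma^\alpha{}_{\beta\gamma}.
```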
(horizontal tangent Lie algebroid)
Let be a smooth manifold or more generally a locally pro-manifold (prop. 19). Then we write for the Lie algebroid over and whose Chevalley-Eilenberg algebra is generated over in degree 1 from the module
of differential 1-forms and whose Chevalley-Eilenberg differential is the de Rham differential, so that the Chevalley-Eilenberg algebra is the de Rham dg-algebra
This is called the tangent Lie algebroid of . As a graded manifold (via remark 24) this is called the “shifted tangent bundle” of .
More generally, let be a fiber bundle over . Then there is a Lie algebroid over the jet bundle of (def. 54) defined by its Chevalley-Eilenberg algebra being the horizontal part of the variational bicomplex (def. 59):
The underlying graded manifold of is the fiber product of the jet bundle of with the shifted tangent bundle of .
There is then a canonical homomorphism of Lie algebroids (def. 116)
(local BRST complex and ghost fields for irreducible closed gauge parameters)
Let be a Lagrangian field theory (def. 60), and let be a bundle of irreducible closed gauge parameters for the theory (def. 23) with bundle morphism
Assuming that the gauge parameter bundle is trivial, , then by example 94 this induces an action of a Lie algebra on by infinitesimal diffeomorphisms.
The corresponding action Lie algebroid (def. 115) has as underlying graded manifold (remark 24)
the jet bundle of the graded field bundle
which regards the gauge parameters as fields in degree 1. As such these are called ghost fields:
Therefore we write suggestively
for the action Lie algebroid of the gauge parameterized implicit infinitesimal gauge symmetries on the jet bundle of the field bundle.
The Chevalley-Eilenberg differential of the BRST complex is traditionally denoted
To express this in coordinates, assume that the field bundle as well as the gauge parameter bundle are trivial vector bundles (example 9) with the field coordinates on the fiber of with induced jet coordinates and with the ghost field coordinates on the fiber of with induced jet coordinates .
Then in terms of the corresponding coordinate expression for the gauge symmetries (137) the BRST differential is given on the fields by
and on the ghost fields by
and it extends from there, via prop. 28, to jets of fields and ghost fields by (anti-)commutativity with the total spacetime derivative.
Moreover, since the action of the infinitesimal gauge symmetries is by definition via prolongations (prop. 28) of evolutionary vector fields (def. 64) and hence compatible with the total spacetime derivative (69) this construction descends to the horizontal tangent Lie algebroid (example 96) to yield
The Chevalley-Eilenberg differential on is
The Chevalley-Eilenberg algebra of functions on this differential graded manifold (140) is called the off-shell local BRST complex (Barnich-Brandt-Henneaux 94).
We may pass from the local BRST complex on the jet bundle to the “global” BRST complex by transgression of variational differential forms (def. 82):
Write for the induced graded algebra of observables (def. 83). For with corresponding local observable its BRST differential is defined by
and extended from there to as a graded derivation.
(local BRST complex for free electromagnetic field on Minkowski spacetime)
Consider the Lagrangian field theory of free electromagnetism on Minkowski spacetime (example 40) with its gauge parameter bundle as in example 92.
By (139) the action of the BRST differential is the derivation
In particular the Lagrangian density is BRST-closed
as is the Euler-Lagrange form (due to the symmetry and in contrast to the skew-symmetry ).
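The mechanism behind both statements is that the BRST differential annihilates the field strength. The following sketch makes this explicit; the names $a_\mu$ for the field coordinates, $c$ for the ghost coordinate and $f_{\mu\nu}$ for the field strength jet coordinates, as well as the normalization of $f_{\mu\nu}$, are assumptions and may differ from the conventions of example 40 by constant factors.

```latex
% Assumed: abelian gauge symmetry, so the ghost carries no index and s c = 0;
% f_{\mu\nu} is proportional to a_{\nu,\mu} - a_{\mu,\nu} (constant k not fixed here).
\begin{align*}
  s\, a_\mu &= c_{,\mu} ,
  \qquad
  s\, a_{\mu,\nu} = c_{,\mu\nu} ,
  \qquad
  s\, c = 0 ,
  \\
  s\, f_{\mu\nu}
    &= k \left( s\, a_{\nu,\mu} - s\, a_{\mu,\nu} \right)
     = k \left( c_{,\nu\mu} - c_{,\mu\nu} \right)
     = 0 .
\end{align*}
% The last step uses that jet coordinates are symmetric in their derivative indices,
% c_{,\mu\nu} = c_{,\nu\mu}, while f_{\mu\nu} = - f_{\nu\mu} is skew.
% Since s commutes with total spacetime derivatives up to sign, every function of
% f_{\mu\nu} and its total derivatives is s-closed.
```

In particular the coefficient of the Lagrangian density, which is quadratic in $f_{\mu\nu}$, and the coefficients of the Euler-Lagrange form, which are total derivatives of $f^{\mu\nu}$, are annihilated by $s$.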
This concludes our discussion of gauge symmetries as such. In the next chapter we discuss the homotopy quotient of the covariant phase space by the gauge symmetries, called the reduced phase space.
We have seen above that the covariant phase space of a Lagrangian field theory is, if it exists, the “covariant transgression” of the shell (prop. 22) equipped with the local Poisson bracket (example 36). The local observables (def. 83) which operationally define the classical field theory (or rather prequantum field theory) are functions on the covariant phase space, and the local Poisson bracket on them operationally defines the corresponding quantum field theory (below). Therefore the existence of the covariant phase space is crucial for the construction of the field theory.
However, we have then seen in prop. 77 that there may be an obstruction to the existence of the covariant phase space, namely the presence of infinitesimal gauge symmetries of the Lagrangian which have been “left implicit”. We have then discussed how to make these infinitesimal gauge symmetries “explicit” by hard-wiring their action into the geometry of the fields by passing to the corresponding infinitesimal homotopy quotient (def. 114) of (the jet bundle of) the field bundle, given by the corresponding action Lie algebroid (def. 115). Its Chevalley-Eilenberg algebra of functions is called the local BRST complex of the theory (example 97).
The corresponding covariant phase space with infinitesimal gauge symmetries made explicit is now correspondingly given by the shell (48) not inside the plain space of fields, but inside this homotopy quotient by the infinitesimal gauge symmetries. This homotopy quotient of the naive phase space by the infinitesimal gauge symmetries is called the reduced phase space. Refined to its local incarnation in the jet bundle we may call this the “derived reduced prolonged shell” (def. 120 below). Its algebra of functions is called the local BV-BRST complex of the theory.
In the next section below we find that, at least in good situations, if all non-trivial implicit infinitesimal gauge symmetries have been made explicit this way by hard-wiring their action into the geometry of the reduced phase space, then the obstruction to the existence of the covariant phase space vanishes. Hence in this case the (perturbative) quantum field theory can exist (discussed further below). This is why we do need to pass to the reduced phase space.
In order to exhibit the key structure of the reduced phase space without getting distracted by the local jet bundle geometry, we first discuss now the simple form in which it would appear after transgression (def. 82) if spacetime were compact, so that, by the principle of extremal action (prop. 45), it would be the derived critical locus () of a globally defined action functional . This is example 100 below.
This serves as a warmup to the true construction of the derived shell in the action Lie algebroid of the jet bundle, where the action functional is “de-transgressed” to the Lagrangian density, which is invariant under gauge transformations only up to horizontally exact terms. This culminates in example 82 below.
The key to understanding the “derived reduced prolonged shell”, and hence the reduced phase space, as a derived critical locus is first to exhibit the Euler-Lagrange variation of the action functional, or rather of the Lagrangian density, as a section of the analog of a cotangent bundle, but now in the realm of Lie ∞-algebroids (prop. 80 and prop. 81 below). To this end we need to first of all consider homomorphisms of Lie algebroids:
(homomorphism between Lie algebroids)
Given two derived Lie algebroids , (def. 114), then a homomorphism between them
is a dg-algebra-homomorphism between their Chevalley-Eilenberg algebras going the other way around
such that this covers an algebra homomorphism on the function algebras (a “non-curved sh-map”)
(gauge invariant functions in terms of Lie algebroids)
Let be an action Lie algebroid (example 115) and regard the real line as a Lie algebroid by example 95. Then homomorphisms of Lie algebroids (def. 116) of the form
hence smooth functions on the Lie algebroid, are equivalently
ordinary smooth functions on the underlying smooth manifold,
which are invariant under the Lie action in that .
An -algebra homomorphism
is fixed by what it does to the canonical coordinate function on , which is taken by to . For this to be a dg-algebra homomorphism it needs to respect the differentials on both sides. Since the differential on the right is trivial, the condition is that .
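In coordinates the argument reads as follows. This is a sketch under assumed notation: $x$ denotes the canonical coordinate on the real line, $S$ its image under the homomorphism, $\phi^a$ are coordinates on the base space, and $R^a_\alpha$ and $c^\alpha$ are the action components and degree-1 generators of the action Lie algebroid.

```latex
% Assumed notation (illustrative): f^\ast denotes the dual algebra homomorphism,
% f^\ast(x) = S \in C^\infty(X); the differential on C^\infty(\mathbb{R}) is zero.
\begin{align*}
  d_{CE}\big( f^\ast(x) \big) &= f^\ast\big( d\, x \big) = f^\ast(0) = 0
  \\
  \Longleftrightarrow \quad
  d_{CE}\, S &= c^\alpha\, R^a_\alpha\, \frac{\partial S}{\partial \phi^a} = 0
  \\
  \Longleftrightarrow \quad
  R^a_\alpha\, \frac{\partial S}{\partial \phi^a} &= 0
  \quad \text{for all } \alpha .
\end{align*}
% The last equivalence holds because the c^\alpha are linearly independent generators.
```

So the chain-map condition is exactly the statement that $S$ is annihilated by every vector field of the infinitesimal action.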
Given a gauge invariant function, hence a function on a Lie algebroid (example 98), its exterior derivative should be a section of the cotangent bundle of the Lie algebroid. Moreover, if all field variations are infinitesimal (as in def. 84) then it should in fact be a section of the infinitesimal neighbourhood (example 27) of the zero section inside the cotangent bundle, the infinitesimal cotangent bundle of the Lie algebroid (def. 117 below).
To motivate the definition 117 below of infinitesimal cotangent bundle of a Lie algebroid recall from example 27 that the algebra of functions on the infinitesimal cotangent bundle should be fiberwise the formal power series algebra in the linear functions. But a fiberwise linear function on a cotangent bundle is by definition a vector field. Finally observe that vector fields are equivalently derivations of smooth functions (prop. 1). This leads to the following definition:
(infinitesimal cotangent Lie algebroid)
Let be a Lie ∞-algebroid (def. 114) over some manifold . Then its infinitesimal cotangent bundle is the Lie ∞-algebroid over whose underlying graded module over is the direct sum of the original module with the derivations of the graded algebra underlying :
with differential on the summand being the original differential and on being the commutator with the differential on (which is itself a graded derivation of degree +1):
There is a canonical homomorphism of Lie algebroids (def. 116)
given dually by the identity on the original generators.
(infinitesimal cotangent bundle of action Lie algebroid)
Let be an action Lie algebroid (def. 115) where
is a Cartesian space with coordinates ;
is a Lie algebra with linear basis and corresponding structure constants ;
the infinitesimal action is given in components by
for smooth functions on .
Then the infinitesimal cotangent Lie algebroid (def. 117) has the following generators
The CE-differential on the new derivation generators is given by
and
To ease the notation one abbreviates
so that the generator content then reads
In this notation the full action of the CE-differential is therefore the following:
(exterior differential of gauge invariant function is section of infinitesimal cotangent bundle)
For a Lie ∞-algebroid (def. 114) over some ; and a gauge invariant smooth function on it (example 98); there is an induced section of the infinitesimal cotangent Lie algebroid (def. 117) bundle projection (141):
given dually
by the map which sends
the generators in to themselves;
a vector field on , regarded as a degree-0 derivation to ;
all other derivations to zero.
We discuss the proof in the special case of example 99. The general case is directly analogous.
We need to check that respects the CE-differentials.
On the original generators in this is immediate, since on these the CE-differentials on both sides are by definition the same.
On the derivation we find from (143)
and on the derivation we find from (142) and using the gauge invariance of
(derived critical locus of gauge invariant function on Lie ∞-algebroid)
Let be a Lie ∞-algebroid (def. 114) over some , let
be a gauge invariant function (example 98) and consider the section of its infinitesimal cotangent bundle (def. 99) corresponding to its exterior derivative via prop. 80.
Then the derived critical locus of is the derived Lie algebroid (def. 114) to be denoted which is the homotopy pullback of the section along the zero section:
This means equivalently (details are at derived critical locus) that the Chevalley-Eilenberg algebra of is like that of the infinitesimal cotangent Lie algebroid (def. 117) except for two changes:
all derivations are shifted down in degree by one;
the CE-differential on the derivations coming from vector fields on is that of the infinitesimal cotangent Lie algebroid plus .
(archetype of the BV-BRST complex)
Consider a gauge invariant function (def. 98) on an action Lie algebroid (def. 115) for the case that the underlying manifold is a Cartesian space with global coordinates as in example 99. Then the generators of the derived critical locus (def. 118) are as in (145), except for the degree shift:
and the CE-differential is given by
which is like the differential (146) of the cotangent Lie algebroid from example 99, except for the degree-shift by -1 of the derivation generators and except for the crucial new term indicated by the underbrace.
If we think of the function as being the action functional (example 66) of a Lagrangian field theory (def. 60) over a compact spacetime , with the space of field histories (or rather an infinitesimal neighbourhood therein), hence with a Lie algebra of gauge symmetries acting on the field histories, then the Chevalley-Eilenberg algebra of the derived critical locus of is called the BV-BRST complex of the theory.
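To see what the new term does, it may help to look at the simplest possible instance, with no gauge symmetry at all. The following sketch takes the underlying space to be the real line with coordinate $x$, the Lie algebra to be trivial, and the function to be $S = \tfrac{1}{2} x^2$; the symbol $x^\ddagger$ for the degree-shifted derivation generator and the sign of the differential on it are assumptions for illustration.

```latex
% Toy instance (assumed notation): one generator x in degree 0,
% one shifted derivation generator x^\ddagger in degree -1, no ghosts.
\begin{align*}
  d\, x &= 0 ,
  \qquad
  d\, x^\ddagger = \frac{\partial S}{\partial x} = x ,
  \\
  H^{0} &= C^\infty(\mathbb{R}) \,/\, ( x ) \;\cong\; \mathbb{R} ,
  \\
  H^{-1} &= \{\, f\, x^\ddagger \;|\; f \cdot x = 0 \,\} = 0 .
\end{align*}
% H^0 is the algebra of functions on the critical locus \{x = 0\} of S,
% and the vanishing of H^{-1} says that the complex resolves that algebra.
```

In this instance the complex is just the Koszul complex of the single element $\partial S / \partial x$, and the degree-0 cohomology recovers the functions on the ordinary critical locus.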
In applications of interest, the spacetime is not compact. In that case one may still appeal to a construction on the space of field histories as in example 100 by considering the action functional for all adiabatically switched Lagrangians, with . This approach is taken in (Fredenhagen-Rejzner 11a).
Here we instead consider now the “local lift” or “de-transgression” of the above construction from the space of field histories to the jet bundle of the field bundle of the theory, refining the BV-BRST complex to the local BV-BRST complex, corresponding to the local BRST complex from example 97.
This requires a slight refinement of the construction that leads to example 100: In contrast to the action functional , the Lagrangian density is not strictly invariant under implicit infinitesimal gauge transformations, in general, rather it may change up to a horizontally exact term (by the very definition \ref{ImplicitInfinitesimalGaugeSymmetry}). The same is then true for its Euler-Lagrange variational derivative . This means that is not a section of the infinitesimal cotangent bundle (def. 117) of the gauge action Lie algebroid on the jet bundle, but of a local version of it, which is twisted by horizontally exact terms.
The following definition 119 is the local refinement of def. 117:
(local infinitesimal cotangent Lie algebroid)
Let be a Lagrangian field theory (def. 60) over some spacetime , and let be a bundle of closed irreducible gauge parameters (def. 23), inducing via example 97 the Lie algebroid
whose Chevalley-Eilenberg algebra is the local BRST complex of the field theory.
Consider the case that both the field bundle (def. 34) as well as the gauge parameter bundle are trivial vector bundles (example 9) over Minkowski spacetime with field coordinates and gauge parameter coordinates .
Then the vertical infinitesimal cotangent Lie algebroid (def. 117) has coordinates as in (145) as well as all the corresponding jets and including also the horizontal differentials:
Observe that in terms of these coordinates the ordinary commutator of graded derivations has the following succinct expression:
(…)
Now consider the modification of this formula to the formula
where denotes the Euler-Lagrange variational derivative.
We define the CE-differential on functions on to be
This defines a Lie ∞-algebroid to be denoted
The local refinement of prop. 80 is now this:
(Euler-Lagrange form is section of local cotangent bundle of jet bundle gauge-action Lie algebroid)
Let be a Lagrangian field theory (def. 60) over some spacetime , and let be a bundle of closed irreducible gauge parameters (def. 23), inducing via example 97 the Lie algebroid and via def. 119 its local cotangent Lie ∞-algebroid .
Then the Euler-Lagrange variational derivative (prop. 22) constitutes a section of the local cotangent Lie ∞-algebroid (def. 119)
given dually
by
The proof of this proposition is a special case of the observation that the differentials involved are part of the local BV-BRST differential; this will be a direct consequence of the proof of prop. 82 below.
The local analog of def. 118 is now the following definition 120 of the “derived prolonged shell” of the theory (recall the ordinary prolonged shell from (49)):
(derived reduced prolonged shell)
Let be a Lagrangian field theory (def. 60) over some spacetime , and let be a bundle of closed irreducible gauge parameters (def. 23), inducing via prop. 81 a section of the local cotangent Lie algebroid of the jet bundle gauge-action Lie algebroid.
Then the derived prolonged shell is the derived critical locus of , hence the homotopy pullback of along the zero section of the local cotangent Lie ∞-algebroid:
The local refinement of example 100 is now the following:
Let be a Lagrangian field theory with bundle of closed irreducible gauge parameters …
…. the Chevalley-Eilenberg algebra of the derived prolonged shell (def. 120) is the local BV-BRST complex…
By unwinding the definitions analogous to the proof of example 99, the CE-differential is given by the modified bracket of derivations (147) with the sum of the BRST-differential and the Lagrangian density:
only that in the homotopy fiber the derivations receive a degree-shift by -1 compared to their degree in .
This operation is the local BV-BRST differential by (Barnich-Henneaux 96 (2.12)-(2.13)).
(derived prolonged shell in the absence of explicit gauge symmetry – the local BV-complex)
Let be a Lagrangian field theory (def. 60) with no non-trivial infinitesimal gauge symmetries made explicit (possibly because there are none, as for the scalar field), hence with no ghost fields introduced. Then the local derived critical locus of its Lagrangian density (def. 120) is the local BV-complex of def. 85.
(local BV-complex of vacuum electromagnetism on Minkowski spacetime)
Consider the Lagrangian field theory of free electromagnetism on Minkowski spacetime (example 40) with gauge parameter as in example 92. With the field and gauge parameter coordinates as chosen in these examples
then the local BV-BRST complex (prop. 82) has generators
together with their total spacetime derivatives, and the local BV-BRST differential acts on these generators as follows:
So far the discussion yields just the algebra of functions on the derived reduced prolonged shell. We now discuss the derived analog of the full variational bicomplex (def. 59) to the derived reduced shell.
The analog of the de Rham complex of a derived Lie algebroid is called the Weil algebra:
(Weil algebra of a Lie algebroid)
Given a derived Lie algebroid over some (def. 114), its Weil algebra is
where acts as the de Rham differential on functions, and as the degree shift operator on the graded elements.
| smooth manifolds | derived Lie algebroids |
|---|---|
| algebra of functions | Chevalley-Eilenberg algebra |
| algebra of differential forms | Weil algebra |
(classical Weil algebra)
Let be a Lie algebra with corresponding Lie algebroid (example 95). Then the Weil algebra (def. 121) of is the traditional Weil algebra of from classical Lie theory.
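For orientation, the differential of the traditional Weil algebra can be written out in a basis. The following is a sketch under assumed notation: $c^\alpha$ are the degree-1 generators, $r^\alpha := \mathbf{d} c^\alpha$ the shifted generators in degree 2, and the Chevalley-Eilenberg differential is taken with the convention $d_{CE} c^\alpha = \tfrac{1}{2} \gamma^\alpha{}_{\beta\gamma} c^\beta c^\gamma$. Signs are convention-dependent and may differ from those of other references.

```latex
% Assumed conventions (illustrative): d_W = d_{CE} + \mathbf{d}, where \mathbf{d} is the
% shift c^\alpha \mapsto r^\alpha, \mathbf{d} r^\alpha = 0, and d_{CE} is extended to
% r^\alpha by requiring that d_{CE} and \mathbf{d} anticommute.
\begin{align*}
  d_W\, c^\alpha
    &= \tfrac{1}{2}\, \gamma^\alpha{}_{\beta\gamma}\, c^\beta c^\gamma + r^\alpha ,
  \\
  d_W\, r^\alpha
    &= d_{CE}\, \mathbf{d}\, c^\alpha
     = - \mathbf{d}\, d_{CE}\, c^\alpha
     = - \gamma^\alpha{}_{\beta\gamma}\, r^\beta c^\gamma ,
  \\
  d_W\, d_W\, c^\alpha
    &= \underbrace{d_{CE}\, d_{CE}\, c^\alpha}_{=\, 0}
       + \mathbf{d}\, d_{CE}\, c^\alpha
       + d_{CE}\, r^\alpha
     = 0 .
\end{align*}
% Here \mathbf{d}(\tfrac12 \gamma^\alpha{}_{\beta\gamma} c^\beta c^\gamma)
%   = \gamma^\alpha{}_{\beta\gamma} r^\beta c^\gamma
% by the skew-symmetry of \gamma in its lower indices.
```

Setting $r^\alpha = 0$ recovers the Chevalley-Eilenberg differential, in line with the table above relating the Chevalley-Eilenberg algebra to functions and the Weil algebra to differential forms.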
(variational BV-bicomplex)
Let be a Lagrangian field theory (def. 60) equipped with a closed irreducible gauge parameter bundle (def. 23). Consider the Lie algebroid from example 97, whose Chevalley-Eilenberg algebra is the local BRST complex of the theory.
Then its Weil algebra (def. 121) has as differential the variational derivative (def. 59) plus the BRST differential
Therefore we speak of the variational BRST-bicomplex and write
Similarly, the Weil algebra of the derived prolonged shell (def. 120) has differential
Since is the BV-BRST differential (prop. 82) this defines the “variational BV-BRST-bicomplex”.
(…)
It turns out that the local BV-BRST cohomology (prop. 82) of the “derived reduced prolonged shell” very neatly captures all the aspects of Lagrangian field theory that we have been discussing so far:
(Noether theorem I in terms of local BRST cohomology)
The -closed elements in degree are precisely pairs consisting of an implicit infinitesimal local gauge symmetry and a conserved current for it.
The -exact elements in this degree are sums of
-exact currents;
on-shell vanishing implicit gauge transformations;
on-shell vanishing currents with their horizontally exact gauge transformations
(…)
The -closed elements are the implicit infinitesimal gauge symmetries regarded as an antifield multiplied with the volume form together with their Noether current (prop. 30)
Such a pair is exact if
(…)
(infinitesimal gauge symmetry via local BRST cohomology)
An infinitesimal gauge symmetry of gauge parameter is a vector field on the jet bundle with components of the form
such that this is an infinitesimal symmetry of the Lagrangian in that
for all .
The corresponding anti ghost field are taken by the BV-BRST differential to the antifield-preimage of the term on the left:
Moreover, an on-shell vanishing infinitesimal symmetry of the Lagrangian is a vector field with components of the form
for a skew-symmetric system of smooth functions on the jet bundle.
The linear combination of such an infinitesimal gauge symmetry and an on-shell vanishing infinitesimal symmetry is -exact:
(Barnich-Brandt-Henneaux 94, p. 20)
It may be useful to organize this expression into the -bicomplex like so:
(…)
We had seen above that the key intermediate construction for obtaining the quantum field theory induced from a Lagrangian density is its covariant phase space (prop. 46). But then we have seen that there are generically obstructions to the existence of the covariant phase space, embodied by infinitesimal gauge transformation that have been “left implicit” (prop. 77). We have then discussed the reduced phase space above which makes the infinitesimal gauge symmetries “explicit” by forming their homotopy quotient, whose algebra of functions, on the derived shell, is the BV-BRST complex of the theory. It remains to show that this construction of the reduced phase space indeed serves to lift the obstruction to the existence of the covariant phase space. This is the topic of gauge fixing (def. … below).
The point is that while the reduced phase space reflected by the BV-BRST complex may still not be manifestly covariant, its existence as an object in homotopy theory, here specifically in homological algebra, means that it comes with a more flexible concept of “equality”, namely homotopy equivalence, which here specifically means quasi-isomorphism.
Broadly speaking, the gauge principle in physics says that no two things (field histories, etc.) are ever really equal, instead they may be connected by gauge transformations, and the mathematical reflection of that is the principle of homotopy theory, where no two homotopy types are ever equal, instead they may be connected by (weak) homotopy equivalence. Hence picking a specific representative of a homotopy type means to fix a gauge.
Concretely, let be a Lagrangian field theory with closed irreducible gauge parameter bundle and let be the corresponding BV-BRST complex (prop. 82).
We then ask for another field bundle , possibly itself already a graded manifold, hence an object in higher differential geometry, and then we ask for a Lagrangian density that may also genuinely live in higher prequantum geometry, hence which is defined right away on the action Lie algebroid (example 97) not necessarily descending to there from itself. We may still form the local derived critical locus of in and obtain a corresponding BV-BRST-like complex .
We ask now that has particularly good properties:
We ask that the Koszul-Tate component of has vanishing cochain cohomology in negative degree, which means by prop. 83 that the Lagrangian on the graded field bundle is degreewise free of the obstruction to the existence of a covariant phase space.
We ask moreover that the remaining Chevalley-Eilenberg component of is compatible with the graded Poisson bracket of this graded covariant phase space
This means that the “dg-Lagrangian field theory ” induces a covariant reduced phase space “internal to” dg-manifolds; hence a “dg-covariant reduced phase space”: a graded covariant reduced phase space equipped with compatible differentials.
Such derived phase spaces are amenable to degreewise quantization (discussed below) if only one can keep the degreewise quantization compatible with the differential. This may be shown (below…) to be the case, and hence performing the quantization degreewise and passing in the end to the cochain cohomology of the resulting BV-BRST complex of quantum observables yields the gauge invariant local observables of the quantum field theory. This is called the “BV-BRST quantization of gauge theories”.
In order to apply this to the Lagrangian field theory that we actually started out to consider, we now only need to ensure that the “manifestly covariant” dg-Lagrangian field theory is not necessarily equal to , but homotopy equivalent to it, as an object in higher prequantum geometry, hence that there is a quasi-isomorphism between the corresponding BV-BRST complexes
The choice of this quasi-isomorphism hence means a choice of particularly good (namely manifestly covariant) representative of the homotopy type of , and hence this is called a gauge fixing of .
Here:
| term | meaning |
|---|---|
| “phase space” | derived critical locus of Lagrangian equipped with Poisson bracket |
| “reduced” | gauge transformations have been homotopy-quotiented out |
| “covariant” | Cauchy surfaces exist degreewise |
In practice this choice of gauge fixing by choice of quasi-isomorphism to a “manifestly covariant” BV-BRST complex is realized as the composite of two separate quasi-isomorphisms:
an “anti-canonical transformation”
(induced by a degree -1 element called, for better or worse, the “gauge fixing fermion”) which is actually a genuine isomorphism, not just a quasi-isomorphism;
a genuine quasi-isomorphism which contracts away a contractible direct summand of auxiliary fields
(…)
(Nakanishi-Lautrup gauge fixing of vacuum electromagnetism)
Consider the local BV-BRST complex
for vacuum electromagnetism on Minkowski spacetime from example 102:
The field bundle is and the gauge parameter bundle is . The 0-jet generators are
and the differential acts as
The Lagrangian density for vacuum electromagnetism is (42)
Consider the contractible chain complex of vector bundles over
In this context is called the field bundle for the Nakanishi-Lautrup field and that for the antighost field.
The corresponding product BV-BRST complex quasi-isomorphic to the original one
has coordinate generators
We say that the Nakanishi-Lautrup gauge fixing fermion for Gaussian averaged Lorentz gauge is
With denoting the anti-Hamiltonian for the differential of the resolved local BV-BRST complex we find from (148) and (149) the antibracket
and then
Therefore the corresponding gauge fixed Lagrangian density is
(see also Henneaux 90, section 9.1)
The Euler-Lagrange equations of motion induced by this Lagrangian density (def. 61) are
Here on the left we show the equations as they appear directly from the Euler-Lagrange variational derivative (prop. 22). The operator on the right is the wave operator (example 25) and denotes the divergence. The equivalence to the equations on the right follows from using in the first equation the derivative of the second equation on the left, which is
and recalling the definition of the universal Faraday tensor (30):
The differential equations on the right are manifestly a system of normally hyperbolic differential equations, as opposed to the plain vacuum Maxwell equations on Minkowski spacetime (see also Rejzner 16, section 7.2).
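The mechanism by which the auxiliary field turns the Maxwell equations into wave equations can be checked in a common textbook normalization. The following sketch is not in the conventions of this example: the field strength $F_{\mu\nu} = \partial_\mu a_\nu - \partial_\nu a_\mu$, the prefactors, the sign of the ghost term and the gauge parameter $\xi$ are all assumptions, so the individual equations may differ from the ones above by constant factors and signs.

```latex
% Assumed textbook normalization (illustrative), with \xi \neq 0 a real parameter:
\begin{align*}
  L_{gf}
    &= - \tfrac{1}{4} F_{\mu\nu} F^{\mu\nu}
       + b\, \partial_\mu a^\mu
       + \tfrac{\xi}{2}\, b^2
       + \partial_\mu \bar c\, \partial^\mu c .
\end{align*}
% Euler-Lagrange equations:
\begin{align*}
  \text{for } b :&
    \quad \partial_\mu a^\mu + \xi\, b = 0 ,
  \\
  \text{for } a_\nu :&
    \quad \partial_\mu F^{\mu\nu} - \partial^\nu b = 0 ,
  \\
  \text{for } \bar c,\, c :&
    \quad \Box\, c = 0 , \qquad \Box\, \bar c = 0 .
\end{align*}
% Using \partial_\mu F^{\mu\nu} = \Box a^\nu - \partial^\nu (\partial_\mu a^\mu)
% and eliminating b = - \tfrac{1}{\xi}\, \partial_\mu a^\mu :
\begin{align*}
  \Box\, a^\nu - \left( 1 - \tfrac{1}{\xi} \right) \partial^\nu ( \partial_\mu a^\mu ) &= 0 ,
  \\
  \xi = 1 : \qquad \Box\, a^\nu &= 0 .
\end{align*}
% Taking the divergence of the a_\nu-equation also gives \Box b = 0,
% because \partial_\nu \partial_\mu F^{\mu\nu} = 0 by skew-symmetry.
```

For $\xi = 1$ every field then satisfies a wave equation, which is the normally hyperbolic form referred to above, whereas without the auxiliary field the term $\partial^\nu(\partial_\mu a^\mu)$ obstructs this.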
(…)
Proposition 77 implies that we need a good handle on determining whether the space of implicit infinitesimal gauge symmetries modulo trivial ones is non-zero. This obstruction turns out to be neatly captured by methods of homological algebra applied to the local BV-complex (def. 85):
(cochain cohomology of local BV-complex)
Let be a Lagrangian field theory (def. 60) whose field bundle is a trivial vector bundle (example 9) and whose Lagrangian density is spacetime-independent (example 24), and let be a constant section of the shell (56).
By inspection we find that the cochain cohomology of the local BV-complex (def. 85) has the following interpretation:
In degree 0 the image of the BV-differential coming from degree -1 and modulo -exact terms
is the ideal of functions modulo that vanish on-shell. Since the differential going from degree 0 to degree 1 vanishes, the cochain cohomology in this degree is the quotient ring
of functions on the shell (94).
In degree -1 the kernel of the BV-differential going to degree 0
is the space of implicit infinitesimal gauge symmetries (def. \ref{ImplicitInfinitesimalGaugeSymmetry}) and the image of the differential coming from degree -2
is the trivial implicit infinitesimal gauge transformations (example 91).
Therefore the cochain cohomology in degree -1 is the quotient space of implicit infinitesimal gauge transformations modulo the trivial ones:
(local BV-complex is homological resolution of the shell precisely if there are no non-trivial implicit infinitesimal gauge symmetries)
Let be a Lagrangian field theory (def. 60) whose field bundle is a trivial vector bundle (example 9) and whose Lagrangian density is spacetime-independent (example 24) and let be a constant section of the shell (56). Furthermore assume that is at least quadratic in the vertical coordinates around .
Then the local BV-complex of local observables (def. 85) is a homological resolution of the algebra of functions on the infinitesimal neighbourhood of in the shell (example 24), hence the canonical comparison morphism (98) is a quasi-isomorphism, precisely if there is no non-trivial (example 91) implicit infinitesimal gauge symmetry (def. \ref{ImplicitInfinitesimalGaugeSymmetry}):
By example 107 the vanishing of non-trivial implicit infinitesimal gauge symmetries is equivalent to the vanishing of the cochain cohomology of the local BV-complex in degree -1 (151).
Therefore the statement to be proven is equivalently that the Koszul complex of the sequence of elements
is a homological resolution of , hence has vanishing cohomology in all negative degrees, already if it has vanishing cohomology in degree -1.
By a standard fact about Koszul complexes (this prop.) a sufficient condition for this to be the case is that
the ring is the tensor product of with a Noetherian ring;
the elements are contained in its Jacobson radical.
The first condition is the case since is by definition a formal power series ring over a field tensored with (by this example). Since the Jacobson radical of a power series algebra consists of those elements whose constant term vanishes (see this example), the assumption that is at least quadratic, hence that is at least linear in the fields, guarantees that all are contained in the Jacobson radical.
Prop. 83 says what gauge fixing has to accomplish: given a local BV-BRST complex we need to find a quasi-isomorphism to another complex which is such that it comes from a graded Lagrangian density whose BV-cohomology vanishes in degree -1 and hence induces a graded covariant phase space, and such that the remaining BRST differential respects the Poisson bracket on this graded covariant phase space.
(…)
Given any space with infinitesimal symmetries acting on it, there is the corresponding homotopy quotient by these infinitesimal symmetries. For the covariant phase space of a Lagrangian field theory, as above with its Poisson Lie algebra of infinitesimal symmetries (def. 90), this infinitesimal homotopy quotient is known as the Poisson Lie algebroid and the corresponding genuine homotopy quotient is known as the symplectic groupoid. As one passes from phase space to its symplectic groupoid, the algebra of functions on phase space – hence the algebra of observables (def. 83) – which is always a commutative algebra deforms to the corresponding algebra of functions on a Lie groupoid called the (polarized) convolution algebra of a Lie groupoid. This is now a non-commutative algebra called the algebra of quantum observables ; and this passage from phase space to its symplectic groupoid homotopy quotient by Hamiltonian symmetries is called quantization (specifically: “geometric quantization of symplectic groupoids”). Here the strength of the non-commutativity is measured by a deformation parameter called Planck's constant .
Since the product in the algebra of quantum observables differs from that in , the positivity condition in the definition of states of a field theory (def. 86) acquires a different meaning. The states after quantization are called quantum states and their difference witnesses that after quantization the Lagrangian field theory is of a different nature: one says that it is no longer a classical field theory, but a quantum field theory and that the objects whose states are expressed by these new quantum states are quantum fields.
Unfortunately, explicitly constructing the algebra of quantum observables of a Lagrangian field theory and hence “constructing the quantum field theory” turns out to be extremely hard, unless some simplifying assumptions are made.
One kind of simplification occurs when the spacetime dimension is very low. For instance if the spacetime dimension is taken to be 0+1 – modelling the approximation where one completely ignores the variation of fields in space and retains just their time evolution – then one speaks of quantum mechanics, which is well understood. Another simplification occurs when the field theory is a free field theory, meaning that its equation of motion is a normally hyperbolic linear differential operator. In this case the quantum field theory is fully understood as long as the underlying spacetime is time-orientable and globally hyperbolic. But, as the name indicates, this captures only the case where there is no interaction among the fields.
Since the algebra of quantum observables is a deformation with strength of the commutative algebra of classical observables controlled by the Poisson Lie algebra, another simplification occurs if one gives up on the demand to understand the full deformation at finite value of Planck's constant and considers just infinitesimal values of . Since this means that the resulting quantum observables are no longer actual smooth functions of , but just formal power series, this is called formal deformation quantization. The resulting “infinitesimally quantized” field theory is called perturbative quantum field theory.
For interacting field theories in spacetime dimension their quantization has been constructed to date only in perturbation theory this way. The construction of full non-perturbative quantum field theory (in dimension with non-vanishing interaction) is, at the time of this writing, a wide open problem.
But perturbative quantum field theory is well understood. This we turn to next…
(…)
Let be a vector space of finite dimension and let be an element of the tensor product (not necessarily skew symmetric at the moment).
We may canonically regard as a smooth manifold, in which case is canonically regarded as a constant rank-2 tensor. As such it has a canonical action by forming derivatives on the tensor product of the space of smooth functions:
If is a linear basis for , identified, as before, with a basis for , then in this basis this operation reads
where denotes the partial derivative of the smooth function along the th coordinate, and where we use the Einstein summation convention.
For emphasis we write
for the pointwise product of smooth functions.
(star product induced by constant rank-2 tensor)
Given as above, then the star product induced by on the formal power series algebra in a formal variable (“Planck's constant”) with coefficients in the smooth functions on is the linear map
given by
Hence
(star product is associative and unital)
Given as above, then the star product from def. 123 is associative and unital with unit the constant function .
Hence the vector space equipped with the star product is a unital associative algebra.
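The associativity and unitality claimed here can be checked symbolically on polynomials, where the exponential series terminates. The following is a minimal sketch, assuming the star product has the form "pointwise product after exp(hbar * pi^{ab} d_a (x) d_b)" without further numerical prefactors; the names `star`, `PI` and `COORDS` are hypothetical, and the sample tensor `PI` is deliberately neither symmetric nor skew-symmetric.

```python
import itertools
import sympy as sp

x, y, hbar = sp.symbols("x y hbar")
COORDS = (x, y)  # coordinates on the 2-dimensional vector space V (illustrative choice)


def iterated_derivative(f, indices):
    """Apply the partial derivatives along the given coordinate indices."""
    for a in indices:
        f = sp.diff(f, COORDS[a])
    return f


def star(f, g, pi, order=4):
    """Truncated star product (assumed normalization):
    sum_k hbar^k/k! pi^{a1 b1}...pi^{ak bk} (d_{a1..ak} f) (d_{b1..bk} g)."""
    f, g = sp.sympify(f), sp.sympify(g)
    n = len(COORDS)
    total = sp.Integer(0)
    for k in range(order + 1):
        for a_idx in itertools.product(range(n), repeat=k):
            df = iterated_derivative(f, a_idx)
            if df == 0:
                continue  # this multi-index does not contribute
            for b_idx in itertools.product(range(n), repeat=k):
                coeff = sp.Integer(1)
                for a, b in zip(a_idx, b_idx):
                    coeff *= pi[a][b]
                dg = iterated_derivative(g, b_idx)
                total += hbar**k / sp.factorial(k) * coeff * df * dg
    return sp.expand(total)


# A constant rank-2 tensor that is neither symmetric nor skew-symmetric.
PI = ((1, 2), (-1, 3))
f, g, h = x**2, x * y, y**2

# For these low-degree polynomials the truncation at order 4 is exact.
left = star(star(f, g, PI), h, PI)
right = star(f, star(g, h, PI), PI)
associator = sp.expand(left - right)
commutator = star(x, y, PI) - star(y, x, PI)
print(associator, commutator)
```

In this sketch the commutator of the two coordinate functions picks up only the skew-symmetric part of the tensor, which is consistent with the role of the Moyal star product discussed further below.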
Observe that the product rule of differentiation says that
Using this we compute as follows:
In the last line we used that the ordinary pointwise product of functions is associative, and wrote for the unique pointwise product of three functions.
The last expression above is manifestly independent of the choice of order of the arguments in the triple star product, and hence it is clear that an analogous computation yields
(shift by symmetric contribution is isomorphism of star products)
Let be a vector space, a rank-2 tensor and a symmetric rank-2 tensor.
Then the linear map
constitutes an isomorphism of star product algebras (prop. 84) of the form
hence identifying the star product induced from with that induced from .
In particular every star product algebra is isomorphic to a Moyal star product algebra (example 86) with the skew-symmetric part of , this isomorphism being exhibited by (minus) the symmetric part.
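As a sanity check of this statement, the following sketch treats the one-dimensional case, where the rank-2 tensor is a single number `p` and the symmetric shift is a single number `s`. The sign and normalization used here, namely the map exp((hbar/2) s d^2/dx^2) going from the star product for `p` to the one for `p + s`, are an assumption fixed by a direct check on exponentials; the conventions of the text may differ by a sign or by the direction of the map. All function names are hypothetical.

```python
import sympy as sp

x, hbar, p, s = sp.symbols("x hbar p s")


def nth_derivative(f, k):
    """k-fold derivative with respect to x."""
    for _ in range(k):
        f = sp.diff(f, x)
    return f


def star(f, g, tensor, order=8):
    """One-dimensional star product: sum_k (hbar*tensor)^k/k! f^(k) g^(k)."""
    f, g = sp.sympify(f), sp.sympify(g)
    total = sp.Integer(0)
    for k in range(order + 1):
        total += (hbar * tensor) ** k / sp.factorial(k) * nth_derivative(f, k) * nth_derivative(g, k)
    return sp.expand(total)


def shift(f, sym, order=8):
    """exp((hbar/2) * sym * d^2/dx^2) applied to f (assumed convention)."""
    f = sp.sympify(f)
    total = sp.Integer(0)
    for k in range(order + 1):
        total += (hbar * sym / 2) ** k / sp.factorial(k) * nth_derivative(f, 2 * k)
    return sp.expand(total)


f, g = x**2 + x, x**3

# The shift intertwines the star product for p with the one for p + s.
lhs = shift(star(f, g, p), s)
rhs = star(shift(f, s), shift(g, s), p + s)
difference = sp.expand(lhs - rhs)
print(difference)
```

Since in one dimension every tensor is symmetric, choosing `s = -p` identifies each of these star products with the plain pointwise product, in line with the last statement of the proposition.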
We need to show that
To this end, observe that the product rule of differentiation applied twice in a row implies that
Using this we compute
Some examples of star products as in def. 123:
If in def. 123, then the star product is the plain pointwise product.
If in def. 123 is skew-symmetric, it may be regarded as a constant Poisson tensor on the smooth manifold . In this case is called a Moyal star product and the star-product algebra is called the Moyal deformation quantization of the Poisson manifold .
(…)
We discuss here the quantum observables for the special case of free field theories (def. 62). In perturbative quantum field theory this is the basis of the construction of all interacting theories in the infinitesimal neighbourhood of the free field theories.
Wick algebra and normal ordered products
(…)
To warm up, we first discuss how the star product (def. 123) of a finite dimensional vector space equipped with almost Kähler structure may be interpreted as “normal-ordered product for a single mode”:
(almost Kähler vector space)
An almost Kähler vector space is a complex vector space equipped with two bilinear forms such that with regarded as a smooth manifold and with regarded as constant tensors, then is an almost Kähler manifold.
(standard almost Kähler vector spaces)
Let equipped with the complex structure given by the canonical identification , let and . Then is an almost Kähler vector space (def. 124).
(Wick algebra of an almost Kähler vector space)
Let be an almost Kähler vector space (def. 124). Then its Wick algebra is the formal power series vector space equipped with the star product
given by the bilinear form
Here
is the ordinary (commutative) product in the formal power series algebra.
To make contact with the traditional notation we decorate the elements in the formal power series algebra with colons and declare the notation
(Wick algebra of a single mode)
Let be a standard almost Kähler vector space according to example 109, with canonical coordinates denoted and . We discuss its Wick algebra according to def. 125 and show that this reproduces the traditional definition of products of “normal ordered” operators.
To that end, consider the complex linear combination of the coordinates to the canonical complex coordinates and , which we suggestively write instead as
(with “” the traditional symbol for the amplitude of a field mode).
We find the value of the almost-Kähler forms on these elements to be
Using this, we find the star product as follows (where we write for the plain commutative product in the formal power series algebra):
These four cases are sufficient to see that in the star-product of general elements, we obtain a correction term to the ordinary commutative product precisely for every pair consisting of a factor of in and a factor in . This is exactly the “normal ordering” prescription.
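The normal ordering prescription for a single mode can be made concrete with a small symbolic computation. This is a sketch under an assumed normalization in which each contraction of a factor `a` on the left with a factor `a_star` on the right contributes exactly one power of hbar; the numerical prefactor fixed by the almost Kähler structure in the text may differ. The symbols `a`, `a_star` and the function `wick_star` are hypothetical names.

```python
import sympy as sp

a, a_star, hbar = sp.symbols("a a_star hbar")


def nth_derivative(f, var, k):
    """k-fold derivative of f with respect to var."""
    for _ in range(k):
        f = sp.diff(f, var)
    return f


def wick_star(f, g, order=6):
    """Single-mode Wick star product (assumed normalization):
    sum_k hbar^k/k! (d/da)^k f * (d/da_star)^k g."""
    f, g = sp.sympify(f), sp.sympify(g)
    total = sp.Integer(0)
    for k in range(order + 1):
        total += hbar**k / sp.factorial(k) * nth_derivative(f, a, k) * nth_derivative(g, a_star, k)
    return sp.expand(total)


# Only the ordering "a to the left of a_star" produces a correction term.
print(wick_star(a_star, a))                          # plain product
print(wick_star(a, a_star))                          # plain product plus hbar
print(wick_star(a, a_star) - wick_star(a_star, a))   # commutator
print(wick_star(a**2, a_star**2))                    # single and double contractions
```

The last line exhibits the familiar pattern of Wick's theorem: one term without contraction, a term for each single contraction, and a term for the double contractions.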
Now to generalize this to the infinite dimensional case of free field theory:
Let be a globally hyperbolic spacetime.
Write for the subalgebra of smooth functionals
on the smooth space of smooth functions on which is generated from those distributions on some Cartesian product whose wave front set excludes those covectors to a point in all whose components are in the future cone or all whose components are in the past cone.
(After deformation quantization below, the distributions appearing in def. 126 are the origin of “operator-valued distributions” in perturbative quantum field theory).
(regular functionals are microcausal)
Every regular functional is a microcausal functional (def. 126), since the wave front set of a distribution that is given by an ordinary function is empty:
(adiabatically switched point interactions are microcausal)
Let be a bump function, then for the smooth functional
is a microcausal functional (def. 126).
If here we think of as a point-interaction term (as for instance in phi^4 theory) then is to be thought of as an “adiabatically switched” coupling constant. These are the relevant interaction terms to be quantized via causal perturbation theory.
For notational convenience, consider the case , the other cases are directly analogous. The distribution in question is the delta distribution
Now for and a chart around this point, the Fourier transform of restricted to this chart is proportional to the Fourier transform of evaluated at the sum of the two covectors:
Since is a plain bump function, its Fourier transform is quickly decaying (in the sense of wave front sets) with (this prop.). Thus only on the cone that function is in fact constant and in particular not decaying.
This means that the wave front set consists of the elements of the form with . Since and are both in the future cone or both in the past cone precisely if , this situation is excluded in the wave front set and hence the distribution is microcausal.
(graphics grabbed from Khavkine-Moretti 14, p. 45)
This shows that microcausality in this case is related to conservation of momentum in the point interaction.
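The Fourier transform computation used in the proof above can be spelled out as follows. This is a sketch in assumed notation: $g$ denotes the bump function, $x, y$ the two spacetime points, $k_1, k_2$ the corresponding covectors, and the sign and normalization of the Fourier transform are one possible convention.

```latex
\begin{aligned}
  \widehat{\,g\,\delta\,}(k_1, k_2)
  & =
  \int \int g(x)\, \delta(x - y)\, e^{i (k_1 \cdot x + k_2 \cdot y)} \, d x \, d y
  \\
  & =
  \int g(x)\, e^{i (k_1 + k_2) \cdot x} \, d x
  \\
  & =
  \widehat{g}(k_1 + k_2)
  \,.
\end{aligned}
```

Since $\widehat{g}$ decays rapidly in its argument, the only directions of non-decay are those with $k_1 + k_2 = 0$, that is, pairs of the form $(k, -k)$ with $k \neq 0$. Such a pair never has both components in the future cone or both in the past cone.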
More generally:
(Hadamard-Moyal star product on microcausal functionals)
Let be a globally hyperbolic spacetime, and let be a Hadamard distribution (def. \ref{HadamardDistribution}) which is guaranteed to exist by prop. \ref{ExistenceOfHadamardDistributions}.
Then the star product
on microcausal functionals is well defined in that the products of distributions that appear in expanding out the exponential are such that the sum of the wave front sets of the factors does not intersect the zero section.
By definition of Hadamard distribution, the wave front set of powers of has all cotangents on the first variables future pointing, and all those on the second variables past pointing. The first variables are integrated against those of and the second against . By definition of microcausal functionals, the wave front sets of and are disjoint from the subsets where all components are future pointing or all components are past-pointing. Therefore the relevant sum of the wave front covectors never vanishes.
(Wick algebra of free quantum field)
Let be a globally hyperbolic spacetime and let be a Hadamard distribution (def. \ref{HadamardDistribution}) which is guaranteed to exist by prop. \ref{ExistenceOfHadamardDistributions}.
Then the Wick algebra of quantum observables of the free scalar field on is the space of microcausal functionals (def. 126) equipped with the Hadamard-Moyal star product from prop. 87:
(To go on-shell, one still needs to quotient out the ideal of elements in the image of .)
In Minkowski spacetime the Hadamard state is simply the usual vacuum state , hence the Hadamard distribution is, as a generalized function
Therefore the abstractly defined Wick algebra as in def. 127 in this case satisfies the relation
This is the traditional expression for the normal ordered Wick product on Minkowski spacetime (e.g. here).
We consider now the axioms for a perturbative S-matrix of a Lagrangian field theory as used in causal perturbation theory (def. 128 below). Since, by definition, the S-matrix is a formal sum of multi-linear continuous functionals, it is convenient to impose axioms on these directly: this is the axiomatics for time-ordered products in def. 129 below. That these latter axioms already imply the former is the statement of prop. 91 below. Its proof requires a close look at the “reverse-time ordered products” for the inverse S-matrix (def. 131 below) and their induced reverse-causal factorization (prop. 90 below).
The axioms we consider here are just the bare minimum of causal perturbation theory, sufficient to imply that the induced perturbative quantum observables organize into a causally local net of quantum observables (discussed below).
In applications one considers further axioms, in particular compatibility of the S-matrix with spacetime symmetry. This is needed for the proof of the main theorem of perturbative renormalization (see below).
Let be a Wick algebra encoding the quantization of free fields in , with
the quantization map (def. \ref{CompactlySupportedPolynomialLocalDensities}).
Then a Lagrangian S-matrix for fields of type perturbing the free fields encoded by , is a functional
(on local observables (def. \ref{CompactlySupportedPolynomialLocalDensities}) times the coupling constant or source strength with values in the algebra of formal power series in the formal variables and in the given Wick algebra) such that the following conditions hold for fixed :
(perturbation)
There exist distributions (multilinear continuous functionals) of the form
for all , such that:
The unary operation is the quantization map
The S-matrix is the exponential of “time-ordered products” in that for
(normalization)
For all we have
Given such a perturbative S-matrix, we say that the generating function (for quantum observables, see def. 132 below) that it induces is the functional
given by
Def. 128 is due to (Epstein-Glaser 73 (1)) (in view of prop. 92 below), except that these authors remain a little vague on the nature of the domain. The domain is made explicit (in terms of axioms for the time-ordered products, see def. 129 below) in (Brunetti-Fredenhagen 99, section 3, Dütsch-Fredenhagen 04, appendix E, Hollands-Wald 04, around (20)); for review see (Rejzner 16, around def. 6.7).
(further axioms)
The list of axioms in def. 128, and similarly those for the time-ordered products below in def. 129, is just the bare minimum which implies that the corresponding quantum observables organize into a causally local net (discussed below). In applications, such as in the discussion of renormalization (below), one considers further axioms, such as unitarity and compatibility with spacetime symmetry.
(invertibility of the perturbative S-matrix)
The multiplicative inverse of the perturbative S-matrix in def. 128 always exists: by the axioms “perturbation” and “normalization” this follows with the usual formula for the multiplicative inverse of formal power series that are non-vanishing in degree 0:
If we write
then
where the last sum does exist in because by the axiom “normalization” has vanishing coefficient in zeroth order, so that only a finite sub-sum of the formal infinite sum contributes in each order.
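The inversion argument can be illustrated in a toy setting where the Wick algebra is replaced by 2x2 matrices and the S-matrix by a power series in a single formal variable, truncated at a fixed order. This is only a sketch: the matrices below are arbitrary non-commuting stand-ins, and the names `multiply`, `inverse` and `ORDER` are hypothetical. The point is that the geometric series in the part without constant term is a finite sum in each order.

```python
import sympy as sp

ORDER = 3  # truncation order of the formal power series
DIM = 2    # size of the matrices standing in for the Wick algebra


def multiply(A, B, order=ORDER):
    """Product of two truncated power series with matrix coefficients."""
    return [
        sum((A[k] * B[n - k] for k in range(n + 1)), sp.zeros(DIM, DIM))
        for n in range(order + 1)
    ]


def inverse(S, order=ORDER):
    """Inverse of a series with constant term 1 via sum_r (-D)^r, D = S - 1."""
    minus_D = [sp.zeros(DIM, DIM)] + [-S[n] for n in range(1, order + 1)]
    power = [sp.eye(DIM)] + [sp.zeros(DIM, DIM) for _ in range(order)]  # (-D)^0
    result = [sp.zeros(DIM, DIM) for _ in range(order + 1)]
    # (-D)^r starts at order r, so r <= order suffices for the truncation.
    for _ in range(order + 1):
        result = [result[n] + power[n] for n in range(order + 1)]
        power = multiply(power, minus_D, order)
    return result


A1 = sp.Matrix([[0, 1], [0, 0]])
A2 = sp.Matrix([[0, 0], [1, 0]])
A3 = sp.Matrix([[1, 2], [3, 4]])
S = [sp.eye(DIM), A1, A2, A3]  # "normalization": the constant term is the unit

S_inv = inverse(S)
identity_series = [sp.eye(DIM)] + [sp.zeros(DIM, DIM) for _ in range(ORDER)]
print(multiply(S, S_inv) == identity_series, multiply(S_inv, S) == identity_series)
```

Even though the coefficients do not commute, the same series is both a left and a right inverse, as expected for formal power series over an associative algebra.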
(intuitive interpretation of the perturbative S-matrix as a “path integral”)
In traditional informal discussion of perturbative quantum field theory, the S-matrix from def. 128 is thought of as a “path integral”, written
where the integration is thought to be over the space of field histories (“field paths”, example 16) which satisfy given asymptotic conditions at ; and as these boundary conditions vary the above is regarded as an integral kernel that defines the required operator in (e.g. Weinberg 95, around (9.3.10) and (9.4.1)).
Here the local density has the interpretation of an interaction Lagrangian density adiabatically switched by a spacetime-dependent coupling “constant”, and has the interpretation of a source field strength.
On the other hand, the kinetic or free field Lagrangian , which in the axiomatic description of def. 128 is implicit in the Wick algebra is interpreted as determining the would-be Gaussian measure “” for the path integral.
Since this measure does not actually exist, in general (or is not known to exist), we may instead think of the axioms for the S-matrix in def. 128 as rigorously defining the path integral, not as an actual integration, but “synthetically” by characterizing the behaviour of the result of the would-be integration.
See also remark 29 below.
Definition 128 suggests focusing on the multilinear operations which define the perturbative S-matrix order-by-order:
Let be a Wick algebra encoding the quantization of free fields in (def. \ref{CompactlySupportedPolynomialLocalDensities}).
A time-ordered product is a sequence of distributions (multilinear continuous functionals) of the form
for all , such that:
(perturbation)
(normalization)
(symmetry) each is symmetric in its arguments
(causal factorization) If then
(notation for time-ordered products as generalized functions)
It will be convenient (as in Epstein-Glaser 73) to think of the time-ordered products, being operator-valued distributions, as generalized functions with dependence on spacetime points:
Moreover, the subscripts on these generalized functions will always be clear from the context, so that in computations we will notationally suppress these.
Finally, due to the “symmetry” axiom in def. 129, a time-ordered product depends only on its set of arguments, not on the order of the arguments. We will write and for sets of spacetime points, and hence abbreviate the expression for the “value” of the generalized function in the above as etc.
In this condensed notation the above reads
This condensed notation turns out to greatly simplify computations, as it absorbs all the “relative” combinatorial prefactors:
(product of perturbation series in generalized function notation)
Let
and
be power series of distributions in formal power series in as in def. 130. Then the product with expansion
is given simply by
This is because for fixed cardinality this sum over all subsets overcounts the sum over partitions of the coordinates as precisely by the binomial coefficient . Here the factor of cancels against the “global” combinatorial prefactor in the above expansion of , while the remaining factor is just the “relative” combinatorial prefactor seen at total order when expanding the product .
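The counting argument can be checked in a scalar toy model in which each coefficient depends only on the number of its arguments. In the sketch below (hypothetical names, arbitrary sample coefficients) the two series are treated as exponential generating functions, and the coefficient of their product is compared with the plain sum over all subsets of the set of arguments.

```python
from fractions import Fraction
from itertools import combinations
from math import factorial


def egf_product_coefficient(A, B, n):
    """n! times the order-n coefficient of
    (sum_k A[k] g^k / k!) * (sum_l B[l] g^l / l!)."""
    return factorial(n) * sum(
        Fraction(A[k], factorial(k)) * Fraction(B[n - k], factorial(n - k))
        for k in range(n + 1)
    )


def subset_sum(A, B, n):
    """Sum over all subsets I of an n-element set of A[|I|] * B[n - |I|]."""
    points = range(n)
    return sum(
        A[len(I)] * B[n - len(I)]
        for k in range(n + 1)
        for I in combinations(points, k)
    )


# Arbitrary integer sample coefficients (illustrative only).
A = [1, 2, -3, 5, 7, 11]
B = [4, -1, 6, 0, 2, 9]

for n in range(6):
    print(n, egf_product_coefficient(A, B, n), subset_sum(A, B, n))
```

The two columns agree because the number of subsets of cardinality k of an n-element set is the binomial coefficient, which is exactly the relative prefactor that appears when multiplying the two series.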
(the traditional error that leads to the notorious divergencies)
Naively it might seem that the time-ordered products of def. 129 are given simply by multiplication with step functions, in the notation as generalized functions (def. 130):
etc. (for instance Weinberg 95, p. 143, between (3.5.9) and (3.5.10)).
This however is simply a mathematical error, in general: Both as well as are distributions and their product of distributions is in general not defined. The notorious “divergencies which plague quantum field theory” are the signature of this ill defined operation.
On the other hand, when both distributions are restricted to the complement of the diagonal (i.e. restricted away from ) then the above expression happens to be well defined and does solve the axioms for time-ordered products.
Hence what needs to be done to properly define the time-ordered product is to choose an extension of distributions of the above expression from the complement of the diagonal to the diagonal. Any such extension will produce time-ordered products. There are in general several different such extensions. This freedom of choice is the freedom of renormalization; or equivalently, by the main theorem of perturbative renormalization theory, this is the freedom of choosing “counter terms” for the local interaction. This we discuss below in Feynman diagrams and (re-)normalization.
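For concreteness, at second order the naive expression discussed in this remark is of the following standard form. The notation is an assumption made for illustration: $\Theta$ denotes the Heaviside step function, $x^0$ the time coordinate of a spacetime point $x$, and $T(x)$ the unary time-ordered product regarded as a generalized function.

```latex
T(x_1, x_2)
  \;\text{``=''}\;
  \Theta\big(x_1^0 - x_2^0\big)\, T(x_1)\, T(x_2)
  \;+\;
  \Theta\big(x_2^0 - x_1^0\big)\, T(x_2)\, T(x_1)
```

Away from the diagonal $x_1 = x_2$ this agrees with what causal factorization demands, whereas on the diagonal the step function is multiplied with a distribution that is singular there, which is the ill-defined product referred to above.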
In order to prove that the axioms for time-ordered products do imply those for a perturbative S-matrix (prop. 91 below) we need to consider the corresponding reverse-time ordered products:
(reverse-time ordered product)
Given a time-ordered product (def. 129), its reverse-time ordered product
for is defined by
where the sum is over all unshuffles of into non-empty ordered subsequences. Alternatively, as a generalized function as in def. 130, this reads
(e.g. Epstein-Glaser 73 (11))
(reverse-time ordered products express inverse S-matrix)
Given time-ordered products (def. 129), the corresponding reverse-time ordered product (def. 131) expresses the inverse (according to remark 26) of the corresponding perturbative S-matrix :
By definition we have
where .
If instead of unshuffles (i.e. partitions into non-empty subsequences preserving the original order) we took partitions into arbitrarily ordered subsequences, we would be overcounting by the factorial of the length of the subsequences, and hence the above may be equivalently written as:
where denotes the symmetric group (the collection of all permutations of elements).
Moreover, since all the are equal, the sum is in fact independent of , it only depends on the length of the subsequences. Since there are permutations of elements the above reduces to
where in the last line we used (153).
In fact prop. 88 is a special case of the following more general statement:
(inversion relation for reverse-time ordered products)
Let be time-ordered products according to def. 129. Then the reverse-time ordered products according to def. 131 satisfy the following inversion relation for all (in the condensed notation of def. 130)
and
This is immediate from unwinding the definitions.
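To see the definitions unwind in a concrete case, the following sketch replaces the operator-valued time-ordered products by random integer matrices, one for each non-empty subset of a three-element set of "spacetime points" (so that the symmetry axiom holds automatically). The reverse-time ordered product is taken to be the signed sum over ordered partitions into non-empty blocks, with sign (-1)^r for r blocks; this sign convention and all names (`T`, `T_bar`, `inversion_sums`) are assumptions made for illustration.

```python
from itertools import combinations
import numpy as np

rng = np.random.default_rng(0)
DIM = 2
ONE = np.eye(DIM, dtype=np.int64)
POINTS = frozenset({1, 2, 3})


def nonempty_subsets(points):
    """All non-empty subsets of the given set, as frozensets."""
    points = sorted(points)
    for k in range(1, len(points) + 1):
        for chosen in combinations(points, k):
            yield frozenset(chosen)


def all_subsets(points):
    yield frozenset()
    yield from nonempty_subsets(points)


# Toy "time-ordered products": one random integer matrix per non-empty subset.
T = {frozenset(): ONE}
for subset in nonempty_subsets(POINTS):
    T[subset] = rng.integers(-3, 4, size=(DIM, DIM))


def ordered_partitions(points):
    """All ordered partitions of a set into non-empty blocks."""
    if not points:
        yield ()
        return
    for first in nonempty_subsets(points):
        for rest in ordered_partitions(points - first):
            yield (first,) + rest


def T_bar(points):
    """Toy reverse-time ordered product: sum_r (-1)^r T(I_1) ... T(I_r)."""
    total = np.zeros((DIM, DIM), dtype=np.int64)
    for blocks in ordered_partitions(frozenset(points)):
        term = ONE
        for block in blocks:
            term = term @ T[block]
        total = total + (-1) ** len(blocks) * term
    return total


def inversion_sums(points):
    """Both sums of the inversion relation over subsets J of the given set."""
    points = frozenset(points)
    left = sum(T[J] @ T_bar(points - J) for J in all_subsets(points))
    right = sum(T_bar(J) @ T[points - J] for J in all_subsets(points))
    return left, right


for subset in nonempty_subsets(POINTS):
    left, right = inversion_sums(subset)
    print(sorted(subset), not left.any(), not right.any())
```

Both sums vanish for every non-empty set of points, independently of the random matrices chosen, which reflects that the relation is purely combinatorial.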
(reverse causal factorization of reverse-time ordered products)
Let be time-ordered products according to def. 129. Then the reverse-time ordered products according to def. 131 satisfy reverse-causal factorization.
(Epstein-Glaser 73, around (15))
In the condensed notation of def. 130, we need to show that for with then
We proceed by induction. If the statement is immediate. So assume that the statement is true for sets of cardinality and consider with .
We make free use of the condensed notation as in example 113.
From the formal inversion
(which uses the induction assumption that ) it follows that
Here
in the second line we used that , together with the causal factorization property of (which holds by general assumption) and that of (which holds by the induction assumption, using that hence that ).
in the third line we decomposed the sum over into two sums over subsets of and :
The first summand in the third line is the contribution where has a non-empty intersection with . This makes range without constraint, and therefore the sum in the middle vanishes, as indicated, as it is the contribution at order of the inversion formula from prop. 89
The second summand in the third line is the contribution where does not intersect . Now the sum over is the inversion formula from prop. 89 except for one term, and so it equals that term.
Using these facts about the reverse-time ordered products, we may finally prove that time-ordered products indeed do induce a perturbative S-matrix:
(time-ordered products induce perturbative S-matrix)
Let be a system of time-ordered products according to def. 129. Then
is indeed a perturbative S-matrix according to def. 128.
The axioms “perturbation” and “normalization” for the S-matrix are immediate from the corresponding axioms of the time-ordered products. What requires proof is that causal additivity of the S-matrix follows from the causal factorization property of the time-ordered products.
Notice that also the simple causal factorization property of the S-matrix
is immediate from the time-ordering axiom of the time-ordered products.
But causal additivity is stronger. It is remarkable that this, too, follows from just the time-ordering (Epstein-Glaser 73, around (73)):
To see this, first expand the generating functional (152) into powers of and
and then compare order-by-order with the given time-ordered product and its induced reverse-time ordered product (def. 131) via prop. 88. (These are also called the “generating retarded products”, discussed in their own right around def. 133 below.)
In the condensed notation of def. 130 and its way of absorbing combinatorial prefactors as in example 113 this yields at order the coefficient
We claim now that the support of is inside the subset for which is in the causal past of . This will imply the claim, because by multi-linearity of it then follows that
and by prop. 92 this is equivalent to causal additivity of the S-matrix.
It remains to prove the claim:
Consider such that the subset of points not in the past of (def. 32), hence the maximal subset with
is non-empty. We need to show that in this case (in the sense of generalized functions).
Write for the complementary set of points, so that all points of are in the past of . Notice that this implies that is also not in the past of :
With this decomposition of , the sum in (154) over subsets of may be decomposed into a sum over subsets of and of , respectively. These subsets inherit the above causal ordering, so that by the causal factorization property of (def. 129) and (prop. 90) the time-ordered and reverse time-ordered products factor on these arguments:
Here the sub-sum in brackets vanishes by the inversion formula, prop. 89.
A genuine local observable should depend on the values of the fields on some compact subset of spacetime. Moreover, a perturbative quantum observable should be a power series in Planck's constant , reducing to the corresponding classical observable at . The perturbative S-matrix axiomatized above is neither localized in spacetime this way, nor is it a power series in (it is a Laurent series in ). So it is not a local observable. But the actual quantum observables on interacting fields may be expressed in terms of the S-matrix by Bogoliubov's formula (def. 132 below).
This formula is consistent in that it implies that local observables form a causally local net as their spacetime support varies (this is prop. 94 below). (On deeper grounds, this formula turns out to yield the formal Fedosov deformation quantization of the interacting field theory (Collini 16).)
Namely a key consequence of the “causal additivity” axiom on the S-matrix in def. 128 turns out to be that the perturbative quantum observables on interacting fields with compact spacetime support (def. 132)
depend on the adiabatic switching of the interaction Lagrangian density only up to canonical unitary isomorphism (prop. 92 below)
form a causally local net of observables in the sense of the Haag-Kastler axioms as the spacetime localization varies (prop. 94 below).
To the extent that a local net of observables may be regarded as defining a quantum field theory, which is the claim of (perturbative) AQFT, this proves that the perturbative S-matrices in causal perturbation theory as in def. 128 indeed make sense, despite the involvement of adiabatic switching of the interaction Lagrangian density which does not make physical sense when interpreted naively: In reality the interaction is of course not (for realistic theories at least) “switched off” outside some bounded region of spacetime; but the result here shows that if we pretend that it is then first of all we get consistent mathematical formulas and moreover we can then nevertheless compute the correct quantum observables that are localized in this spacetime region. But the local net of observables as the spacetime localization varies is supposed to encode the full quantum field theory. Certainly any given experiment in practice probes a bounded spacetime region, and hence the algebra of observables localized in this region is sufficient to compare the theory to experiment.
(perturbative quantum observables on interacting fields via Bogoliubov's formula)
Let be a perturbative S-matrix as in def. 128, and an adiabatically switched interaction Lagrangian density.
Then for a local observable, the perturbative quantum observable corresponding to is the operator-valued distribution
which is the derivative of the generating functional ((152) in def. 128) at vanishing source field:
This definition of without the adiabatic switching is originally due to (Bogoliubov-Shirkov 59), nowadays sometimes called Bogoliubov's formula (e.g. Rejzner 16 (6.12)). The version with adiabatic switching is due to (Epstein-Glaser 73, around (74)). Reviews include (Dütsch-Fredenhagen 00, around (17)).
(intuitive interpretation of Bogoliubov's formula in terms of a “path integral”)
With the perturbative S-matrix intuitively thought of as a “path integral” as in remark 27
the Bogoliubov formula in def. 132 similarly would have the following heuristic interpretation:
If here we were to regard the expression
as a “complex probability measure” on the space of field histories (“field paths”), then this formula would express the expectation value of the functional under this measure:
The power series coefficients of the quantum observables on interacting fields are also called the retarded products. For the time being we mention these here just for completeness:
(retarded products induced from perturbative S-matrix)
It follows from the perturbation axiom in def. 128 that there is a system of continuous linear functionals
for all such that
Similarly there is
such that
These are called the (generating) retarded products (Glaser-Lehmann-Zimmermann 57, Epstein-Glaser 73, section 8.1).
Direct axiomatization of the retarded products is due to (Dütsch-Fredenhagen 04), see (Collini 16, section 2.2).
It is useful now to reformulate the causal additivity-property of the perturbative S-matrix in terms of the generating functions / retarded products:
(causal locality of the perturbative S-matrix)
Let be a perturbative S-matrix according to def. 128 with the generating functional (152) it induces
The following conditions are equivalent for all :
Hence causal additivity in def. 128 implies that all these conditions hold if .
If is spacelike separated from , hence if the causal ordering (def. 32) is and then
Similarly, if and then
If on a causally closed subset then there exists an invertible such that for all with it relates to by conjugation:
The equivalence of the three conditions in the first statement is immediate from the definitions:
Expanding out the definition of , the first expression is equivalent to
Multiplying both sides of this equation by , shows that it is equivalent to the third clause.
Multiplying once more with this third equation is seen to be equivalent to
which is equivalently the second clause, by definition of .
Now the first clause of the first item immediately implies the first clause of the second item.
Similarly, setting and and in the third clause of the first item it reduces to
Hence if and then
which is the second clause of the second statement to be shown.
For the last statement, notice that by causal closure of the difference , which by assumption has , may, according to lemma 1, be written as
such that their causal order (def. 32) is
It follows with causal additivity and its equivalent formulations above that
and hence the last statement holds for .
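The algebra behind causal additivity and its reformulation in terms of generating functions can be checked in a small toy model. The following is only a sketch under simplifying assumptions that are not part of the text: spacetime is replaced by three time slices (early, middle, late), the Wick algebra by exact 3×3 matrices, and the S-matrix by an ordered product of exponentials with later slices standing to the left. All function and variable names are hypothetical.

```python
from fractions import Fraction


def zero():
    return [[Fraction(0)] * 3 for _ in range(3)]


def strict_upper(x, y, z=0):
    """Strictly upper triangular 3x3 matrix; any such N satisfies N^3 = 0."""
    n = zero()
    n[0][1], n[1][2], n[0][2] = Fraction(x), Fraction(y), Fraction(z)
    return n


def add(a, b):
    return [[a[i][j] + b[i][j] for j in range(3)] for i in range(3)]


def neg(a):
    return [[-a[i][j] for j in range(3)] for i in range(3)]


def mul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


IDENTITY = [[Fraction(int(i == j)) for j in range(3)] for i in range(3)]


def exp_nilpotent(n):
    """Exact exponential I + N + N^2/2, valid because N^3 = 0."""
    half_n2 = [[entry / 2 for entry in row] for row in mul(n, n)]
    return add(add(IDENTITY, n), half_n2)


# Toy "interaction": a triple (early, middle, late), one matrix per time slice.
def plus(v, w):
    return tuple(add(a, b) for a, b in zip(v, w))


def s_matrix(v):
    """Toy time-ordered exponential: later slices stand to the left."""
    early, middle, late = v
    return mul(exp_nilpotent(late),
               mul(exp_nilpotent(middle), exp_nilpotent(early)))


def s_matrix_inverse(v):
    early, middle, late = v
    return mul(exp_nilpotent(neg(early)),
               mul(exp_nilpotent(neg(middle)), exp_nilpotent(neg(late))))


def generating_function(interaction, source):
    """Z_L(A) = S(L)^{-1} S(L + A)."""
    return mul(s_matrix_inverse(interaction),
               s_matrix(plus(interaction, source)))


L = (strict_upper(1, 0), strict_upper(0, 1), strict_upper(1, 1))
A_LATE = (zero(), zero(), strict_upper(0, 2))    # supported in the late slice
A_EARLY = (strict_upper(3, 0), zero(), zero())   # supported in the early slice

lhs = s_matrix(plus(plus(L, A_LATE), A_EARLY))
# Causal additivity with the later observable on the left:
right_order = mul(s_matrix(plus(L, A_LATE)),
                  mul(s_matrix_inverse(L), s_matrix(plus(L, A_EARLY))))
# The same expression with the two observables exchanged:
wrong_order = mul(s_matrix(plus(L, A_EARLY)),
                  mul(s_matrix_inverse(L), s_matrix(plus(L, A_LATE))))

print(lhs == right_order, lhs == wrong_order)  # prints: True False
```

In this toy model the factorization holds only when the later-supported observable stands to the left, which mirrors the role of the causal ordering in the proposition. The equivalent multiplicativity of the generating function can be checked in the same way, by comparing `generating_function(L, plus(A_LATE, A_EARLY))` with the product of the two separate generating functions.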
We now use this fact (prop. 92) to neatly organize the system of localized quantum observables on interacting fields:
(system of perturbative generating algebras of observables)
Let be a perturbative S-matrix according to def. 128 and let be an interaction Lagrangian density.
For a causally closed subset of spacetime (def. 8) and for an adiabatic switching function (def. 33) which is constant on a neighbourhood of , write
for the smallest subalgebra of the Wick algebra which contains the generating functions for correlation functions (def. 132) of the form , for all those local observables with .
Moreover, write
for the subalgebra of the Cartesian product of all these algebras as ranges, which is generated by the tuples
for with .
Finally, for an inclusion of two causally closed subsets, let
be the algebra homomorphism which is given simply by restricting the index set of tuples.
This construction defines a functor
from the poset of causally closed subsets of spacetime to the category of star algebras.
(Brunetti-Fredenhagen 99, (65)-(67))
(algebra of observables well defined up to canonical isomorphism)
By prop. 92, for every causally closed and every the abstract algebra from def. 134 is canonically isomorphic to the subalgebra of formal power series in the Wick algebra.
Beware the slight subtlety in this statement:
The unitary elements in which exhibit the isomorphisms by conjugation are not unique, since there are many choices of splittings in the proof of prop. 92. But the induced isomorphisms between the algebras generated by the are independent of this ambiguity, since, again by the proof of prop. 92, conjugation by each such gives the same result on the given generators: .
(system of perturbative generating algebras is causally local net of observables)
Given a perturbative S-matrix according to def. 128 and an interaction Lagrangian density , then the system of generating algebras of observables (def. 134) is a causally local net of observables in that
(isotony) For every inclusion of causally closed subsets the corresponding algebra homomorphism is a monomorphism
(causal locality) For two causally closed subsets which are spacelike separated, in that their causal ordering (def. 32) satisfies
then for any further causally closed subset which contains both
then the corresponding images of the generating algebras of and , respectively, commute with each other as subalgebras of the generating algebra of :
(Dütsch-Fredenhagen 00, section 3, following Brunetti-Fredenhagen 99, section 8, Il’in-Slavnov 78)
Isotony is immediate from the definition of the algebra homomorphisms in def. 134.
Causal locality of the system of observables follows from the causal additivity of the S-matrix, by the first clause in the second statement of prop. 92.
In the same way as in def. 134, the actual net of algebras of perturbative quantum observables (def. 132) is defined:
(system of algebras of quantum observables)
Let be a perturbative S-matrix according to def. 128 and let be an interaction Lagrangian density.
For a causally closed subset of spacetime (def. 8) and for a compatible adiabatic switching function (def. 33) write
for the smallest subalgebra of the Wick algebra which contains the perturbative quantum observables on interacting fields (def. 132) supported in .
Moreover, let
be the subalgebra of the Cartesian product of all these algebras as ranges, which is generated by the tuples
for with .
Finally, for an inclusion of two causally closed subsets, let
be the algebra homomorphism which is given simply by restricting the index set of tuples.
This construction defines a functor
from the poset of causally closed subsets in the spacetime to the category of star algebras.
As a corollary of prop. 93 we then have the key result:
(system of algebras of perturbative quantum observables is local net of observables)
Given a perturbative S-matrix according to def. 128 and an interaction Lagrangian density , then the system of algebras of observables (def. 135) is a local net of observables in that
(isotony) For every inclusion of causally closed subsets the corresponding algebra homomorphism is a monomorphism
(causal locality) For two causally closed subsets which are spacelike separated, in that their causal ordering (def. 32) satisfies
then for any further causally closed subset which contains both
then the corresponding images of the algebras of observables of and , respectively, commute with each other as subalgebras of the algebra of observables of :
(Dütsch-Fredenhagen 00, below (17), following Brunetti-Fredenhagen 99, section 8, Il’in-Slavnov 78)
The first point is again immediate from the definition (def. 135).
For the second point it is sufficient to check the commutativity relation on generators. For these the statement follows with prop. 93:
for and .
So far we considered only the axioms on a consistent perturbative S-matrix / time-ordered products and their formal consequences. Now we discuss the actual construction of time-ordered products, hence of perturbative S-matrices, by the process called renormalization of Feynman diagrams.
We first discuss how the time-ordered product, and hence the perturbative S-matrix above, is uniquely determined away from the locus where interaction points coincide (prop. 95 below). Moreover, we discuss how away from that locus the time-ordered product is naturally expressed as a sum of products of distributions of Feynman propagators that are labeled by Feynman diagrams: the Feynman perturbation series (prop. 96 below).
This means that the full time-ordered product is an extension of distributions of these scattering amplitudes to the locus of coinciding vertices. The space of possible such extensions turns out to be finite-dimensional in each order of , parameterizing the choice of point-supported distributions at the interaction points whose scaling degree is bounded by the given Feynman propagators.
For , write
for the subspace of the -fold tensor product of the space of compactly supported polynomial local densities (def. \ref{CompactlySupportedPolynomialLocalDensities}) on those tuples which have pairwise disjoint spacetime support.
(time-ordered product away from the diagonal)
Restricted to (def. 136) there is a unique time-ordered product (def. 129), given by the star product that is induced by the Feynman propagator
in that
Since the singular support of the Feynman propagator is on the diagonal, and since the support of elements in is by definition in the complement of the diagonal, the star product is well defined. By construction it satisfies the axioms “perturbation” and “normalization” in def. 129. The only non-trivial point to check is that it indeed satisfies “causal factorization”:
Unwinding the definition of the Hadamard state and the Feynman propagator , we have
where the propagators on the right have, in particular, the following properties:
the advanced propagator vanishes when its first argument is not in the causal past of its second argument:
the retarded propagator equals the advanced propagator with arguments switched:
is symmetric:
It follows for causal ordering (def. 32) that
and for that
This shows that is a consistent time-ordered product on the subspace of functionals with disjoint support. It is immediate from the above that it is the unique solution on this subspace.
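The argument of this proof can be spelled out as the following short derivation. The notation is an assumption made here for illustration: $\omega$ denotes the Hadamard 2-point function, $\omega_F$ the Feynman propagator, $\Delta_R$ and $\Delta_A$ the retarded and advanced propagators, $H$ the symmetric part, and $\star_H$, $\star_F$ the star products induced by $\omega$ and $\omega_F$. The overall normalization of the propagators is convention dependent; only the relative sign between the two decompositions matters for the argument.

```latex
\[
  \omega = \tfrac{i}{2}\left( \Delta_R - \Delta_A \right) + H ,
  \qquad
  \omega_F = \tfrac{i}{2}\left( \Delta_R + \Delta_A \right) + H .
\]
% Let x range over supp(A_1) and y over supp(A_2).
% Case 1: supp(A_1) does not meet the causal past of supp(A_2), so \Delta_A(x,y) = 0:
\[
  \omega_F(x,y) = \tfrac{i}{2} \Delta_R(x,y) + H(x,y) = \omega(x,y)
  \quad \Longrightarrow \quad
  A_1 \star_F A_2 = A_1 \star_H A_2 .
\]
% Case 2: the opposite ordering, so \Delta_R(x,y) = \Delta_A(y,x) = 0:
\[
  \omega_F(x,y)
  = \tfrac{i}{2} \Delta_A(x,y) + H(x,y)
  = \tfrac{i}{2} \Delta_R(y,x) + H(y,x)
  = \omega(y,x)
  \quad \Longrightarrow \quad
  A_1 \star_F A_2 = A_2 \star_H A_1 .
\]
```

Both cases use only the three properties listed in the proof: the support property of the advanced propagator, the exchange relation between the retarded and the advanced propagator, and the symmetry of $H$.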
(time-ordered product is associative)
Prop. 95 implies in particular that the time-ordered product is associative, in that
It follows that the problem of constructing time-ordered products, and hence (by prop. 91) the perturbative S-matrix, consists of finding a compatible extension of the distribution to the diagonal.
Moreover, by the nature of the exponential expression, this means in each order to extend products of Feynman propagators labeled by graphs whose vertices correspond to the polynomial factors in and and whose edges indicate over which variables the Feynman propagators are to be multiplied.
(scalar field Feynman diagram)
A scalar field Feynman diagram is
a natural number (number of vertices);
a -tuple of elements (the interaction and external field vertices)
for each a natural number (“of edges from the th to the th vertex”).
For a given tuple of interaction vertices we write
for the set of scalar field Feynman diagrams with that tuple of vertices.
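The combinatorial data in this definition can be sketched as a small data structure. This is a minimal illustration only: the class name, field names, and the use of 0-based vertex indices are assumptions made here, and the vertex labels are plain strings standing in for the local observables at the vertices.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ScalarFeynmanDiagram:
    """A tuple of vertices plus an edge number e_ij for each pair i < j."""
    vertices: tuple                              # the v-tuple of vertex labels
    edges: dict = field(default_factory=dict)    # {(i, j): e_ij} with i < j

    def __post_init__(self):
        v = len(self.vertices)
        for (i, j), count in self.edges.items():
            # Only pairs i < j are allowed, so no edge joins a vertex to itself.
            if not (0 <= i < j < v):
                raise ValueError(f"edge key {(i, j)} must satisfy 0 <= i < j < {v}")
            if count < 0:
                raise ValueError("edge numbers must be natural numbers")

    @property
    def num_vertices(self):
        return len(self.vertices)

    @property
    def num_edges(self):
        return sum(self.edges.values())

    def edge_number(self, i, j):
        """Number of edges between vertices i and j, in either order."""
        return self.edges.get((min(i, j), max(i, j)), 0)


# Two cubic vertices joined by two propagators (a one-loop "bubble").
bubble = ScalarFeynmanDiagram(("g phi^3", "g phi^3"), {(0, 1): 2})
print(bubble.num_vertices, bubble.num_edges, bubble.edge_number(1, 0))  # prints: 2 2 2
```

Storing only the pairs with $i \lt j$ matches the definition, in which the edge numbers are indexed by ordered pairs of distinct vertices; pairs that are absent from the dictionary count as zero edges.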
(Feynman perturbation series away from coinciding vertices)
For the -fold time-ordered product away from the diagonal, given by prop. 95
is equal to
where the edge numbers are those of the given Feynman diagram .
We proceed by induction over the number of vertices. The statement is trivially true for a single vertex. Assume it is true for vertices. It follows that
Here in the first step we use the associativity of the time-ordered product (remark 31), in the second step we use the induction assumption, in the third we pass the outer functional derivatives through the pointwise product using the product rule, and in the fourth step we recognize that this amounts to summing in addition over all possible choices of sets of edges from the first vertices to the new st vertex, which yield in total the sum over all diagrams with vertices.
(loop order and powers of Planck's constant)
From prop. 96 one deduces that the order in Planck's constant that a (planar) Feynman diagram contributes to the S-matrix is given (up to a possible offset due to external vertices) by the “number of loops” in the diagram.
In the computation of scattering amplitudes for fields/particles via perturbative quantum field theory the scattering matrix (Feynman perturbation series) is a formal power series in (the coupling constant and) Planck's constant whose contributions may be labeled by Feynman diagrams. Each Feynman diagram is a finite labeled graph, and the order in to which this graph contributes is
where
This comes about, according to the above, because
the explicit -dependence of the S-matrix is
the further -dependence of the time-ordered product is
where denotes the Feynman propagator and the field observable at point (where we are notationally suppressing the internal degrees of freedom of the fields for simplicity, writing them as scalar fields, because this is all that affects the counting of the powers).
The resulting terms of the S-matrix series are thus labeled by
the number of factors of the interaction , these are the vertices of the corresponding Feynman diagram and hence each contribute with
the number of integrals over the Feynman propagator , which correspond to the edges of the Feynman diagram, and each contribute with .
Now the formula for the Euler characteristic of planar graphs says that the number of regions in a plane that are encircled by edges, the faces here thought of as the number of “loops”, is
Hence a planar Feynman diagram contributes with
So far this is the discussion for internal edges. An actual scattering matrix element is of the form
where
is a state of free field quanta and similarly
is a state of field quanta. The normalization of these states, in view of the commutation relation , yields the given powers of .
This means that an actual scattering amplitude given by a Feynman diagram with external vertices scales as
(For the analogous discussion of the dependence on the actual quantum observables on given by Bogoliubov's formula, see there.)
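The power counting for internal edges in this remark can be sketched in a few lines. The sketch assumes a connected planar diagram, uses that each vertex contributes a factor of $\hbar^{-1}$ (from the exponent of the S-matrix) and each internal edge a factor of $\hbar^{+1}$ (from the Feynman propagator in the star product), and does not include the additional offset from the normalization of external states. The function names are hypothetical.

```python
def loop_number(num_vertices, num_edges):
    """Bounded faces of a connected planar graph: L = E - V + 1."""
    return num_edges - num_vertices + 1


def hbar_power(num_vertices, num_edges):
    """Net power of hbar: +1 per internal edge, -1 per vertex."""
    return num_edges - num_vertices


examples = {
    "tree (2 vertices, 1 edge)": (2, 1),
    "bubble (2 vertices, 2 edges)": (2, 2),
    "sunset (2 vertices, 3 edges)": (2, 3),
}

for name, (v, e) in examples.items():
    # In every case the power of hbar equals the loop number minus one.
    print(name, "loops:", loop_number(v, e), "hbar power:", hbar_power(v, e))
```

In each example the exponent of $\hbar$ equals the number of loops minus one, so tree-level diagrams carry $\hbar^{-1}$ and every additional loop raises the power by one.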
…
(…)